Add optional / example plugin that catches common issues #170

Open
aaraney opened this issue Aug 21, 2024 · 1 comment
Labels
enhancement New feature or request ngen.cal Related to ngen.cal package

Comments

Member

aaraney commented Aug 21, 2024

There are a number of places where ngen.cal can fail for reasons outside ngen.cal's control. Many of these issues are known, but ngen.cal can't resolve them and continue execution. ngen.cal also doesn't report many of them, so it's often up to the person conducting the evaluation to either debug the issue or reach out to one of the ngen.cal maintainers for a resolution.

At a high level, a calibration exercise looks like this: ngen.cal runs ngen in a subprocess, evaluates its output (really t-route's output) against observations, adjusts parameters based on the evaluation score, writes a new ngen realization config, and restarts the process. The data, software, and configuration you would likely need for a calibration exercise are:
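The loop described above can be sketched in a few lines. This is only an illustration of the run / evaluate / adjust cycle, not ngen.cal's implementation: `calibrate`, `run_model`, and `mean_squared_error` are hypothetical names, the model is any callable standing in for "run ngen and read t-route's output", and the naive random perturbation stands in for whatever search strategy ngen.cal actually uses.

```python
from __future__ import annotations

import random
from typing import Callable, Dict, Sequence, Tuple

Params = Dict[str, float]


def mean_squared_error(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Score a simulation against observations (lower is better)."""
    return sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs)


def calibrate(
    run_model: Callable[[Params], Sequence[float]],  # hypothetical stand-in for ngen + t-route
    obs: Sequence[float],
    params: Params,
    iterations: int,
    seed: int = 0,
) -> Tuple[Params, float]:
    rng = random.Random(seed)
    best = dict(params)
    best_score = mean_squared_error(run_model(best), obs)
    for _ in range(iterations):
        # "adjust parameters": perturb the best parameter set found so far
        candidate = {name: value + rng.gauss(0.0, 0.1) for name, value in best.items()}
        # "run and evaluate": call the stand-in model and score its output
        score = mean_squared_error(run_model(candidate), obs)
        if score < best_score:
            best, best_score = candidate, score
    return best, best_score
```

For example, calling `calibrate(lambda p: [p["scale"] * x for x in (1.0, 2.0, 3.0)], [2.0, 4.0, 6.0], {"scale": 1.0}, iterations=50)` returns a parameter set whose score is never worse than the starting one.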

data

  • atmospheric forcing files (csv or netcdf)
  • hydrofabric (gpkg or geojson)
  • possibly usgs nudging files (netcdf)

software

  • ngen
  • t-route
  • python environment(s)
  • exercise-relevant bmi shared libraries or bmi python modules

configuration

  • ngen realization config (json)
  • t-route config (yaml)
  • ngen.cal config (yaml)
  • bmi-module-specific (and likely catchment-specific) init_config files (any format)
    • possibly files that a module's init_config file depends on (e.g. an init_config file that links to another file on disk)
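
With that many inputs, a pre-flight check that everything exists before the first ngen run can save a lot of debugging. A minimal sketch, assuming the caller already knows which paths and executables matter; `preflight` and the labels passed to it are hypothetical and not part of ngen.cal:

```python
from __future__ import annotations

import shutil
from pathlib import Path
from typing import Dict, List


def preflight(paths: Dict[str, Path], executables: List[str]) -> List[str]:
    """Return one human-readable problem per missing path or executable."""
    problems: List[str] = []
    for label, path in paths.items():
        # exists() accepts both files (e.g. a gpkg) and directories (e.g. forcing)
        if not path.exists():
            problems.append(f"{label}: path does not exist: {path}")
    for exe in executables:
        # shutil.which searches PATH the same way a shell would
        if shutil.which(exe) is None:
            problems.append(f"executable not found on PATH: {exe}")
    return problems
```

An empty list means nothing obvious is missing; a plugin could raise with the joined messages otherwise.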

There are many ways for things to go wrong. A few examples: relative paths in the realization config, setting output_root in the realization config, or misconfiguring t-route's output. An ngen.cal plugin could make ngen.cal fail on these known problems with helpful information, alleviating common pain points and providing a pathway for improving user interaction with the software.
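
Two of the examples above (relative paths and output_root) can be detected from the realization config alone. The following is a rough sketch, not an existing ngen.cal check: `check_realization` and `iter_strings` are hypothetical names, output_root is assumed to be a top-level key, and treating any string that contains "/" as a path is a heuristic that will produce false positives.

```python
from __future__ import annotations

from pathlib import PurePosixPath
from typing import Any, Dict, Iterator, List, Tuple


def iter_strings(node: Any, where: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (location, value) for every string value in nested JSON data."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_strings(value, f"{where}.{key}" if where else str(key))
    elif isinstance(node, list):
        for i, value in enumerate(node):
            yield from iter_strings(value, f"{where}[{i}]")
    elif isinstance(node, str):
        yield where, node


def check_realization(config: Dict[str, Any]) -> List[str]:
    """Return a list of likely problems in a parsed realization config."""
    problems: List[str] = []
    # Assumption: output_root, if present, is a top-level key
    if "output_root" in config:
        problems.append("output_root is set in the realization config")
    for where, value in iter_strings(config):
        # Heuristic (assumption): a string containing "/" is a filesystem path
        if "/" in value and not PurePosixPath(value).is_absolute():
            problems.append(f"{where}: relative path {value!r}")
    return problems
```

The input would come from something like `json.loads(Path("realization.json").read_text())`; wiring the check into a plugin hook is left out because the right hook depends on ngen.cal's hook specifications.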

@aaraney aaraney added enhancement New feature or request ngen.cal Related to ngen.cal package labels Aug 21, 2024
Member Author

aaraney commented Aug 23, 2024

Example plugin that saves model output and observations:
from __future__ import annotations

import typing

from ngen.cal import hookimpl
from hypy.nexus import Nexus
import pandas as pd

if typing.TYPE_CHECKING:
    from datetime import datetime
    from ngen.cal.meta import JobMeta


class SaveOutput:
    def __init__(self) -> None:
        self.sim: pd.Series | None = None
        self.obs: pd.Series | None = None
        self.first_iteration: bool = True

    @hookimpl(wrapper=True)
    def ngen_cal_model_observations(
        self,
        nexus: Nexus,
        start_time: datetime,
        end_time: datetime,
        simulation_interval: pd.Timedelta,
    ) -> typing.Generator[None, pd.Series, pd.Series]:
        # In short: this wrapper runs up to `yield`, then all other registered
        # `ngen_cal_model_observations` hooks run, and their result is sent
        # back in as the value of `yield`.
        # NOTE: DO NOT MODIFY `obs`
        obs = yield
        if self.first_iteration and obs is None:
            self.first_iteration = False
            return None
        assert isinstance(obs, pd.Series), f"expected pd.Series, got {type(obs)!r}"
        self.obs = obs
        return obs

    @hookimpl(wrapper=True)
    def ngen_cal_model_output(
        self, id: str | None
    ) -> typing.Generator[None, pd.Series, pd.Series]:
        # In short: this wrapper runs up to `yield`, then all other registered
        # `ngen_cal_model_output` hooks run, and their result is sent back in
        # as the value of `yield`.
        # NOTE: DO NOT MODIFY `sim`
        sim = yield
        assert isinstance(sim, pd.Series), f"expected pd.Series, got {type(sim)!r}"
        self.sim = sim
        return sim

    @hookimpl
    def ngen_cal_model_iteration_finish(self, iteration: int, info: JobMeta) -> None:
        # nothing to save if no simulation output was captured
        if self.sim is None:
            return None
        assert self.obs is not None, "make sure `ngen_cal_model_observations` was called"
        # index: hourly datetime
        # columns: `obs_flow` and `sim_flow`; units m^3/s
        df = pd.merge(self.sim, self.obs, left_index=True, right_index=True)
        df.reset_index(names="time", inplace=True)
        df.to_parquet(f"sim_obs_{iteration}.parquet")
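
The two `wrapper=True` hooks above rely on generator send/return mechanics. The sketch below shows only that mechanic in plain Python, assuming `hookimpl` is a pluggy-style hook marker; `observing_wrapper` and `call_with_wrapper` are hypothetical names, and a real plugin manager also handles exceptions and multiple implementations.

```python
from __future__ import annotations

from typing import Callable, Generator, List


def observing_wrapper(seen: List[float]) -> Generator[None, float, float]:
    result = yield        # paused here while the wrapped implementation runs
    seen.append(result)   # record the result without modifying it
    return result         # whatever is returned becomes the final result


def call_with_wrapper(
    wrapper: Generator[None, float, float],
    inner: Callable[[], float],
) -> float:
    next(wrapper)             # run the wrapper up to its `yield`
    result = inner()          # run the wrapped implementation
    try:
        wrapper.send(result)  # resume the wrapper with the result
    except StopIteration as stop:
        return stop.value     # the wrapper's `return` value
    raise RuntimeError("wrapper yielded more than once")
```

Here `call_with_wrapper(observing_wrapper(seen), lambda: 3.5)` hands `3.5` back to the caller and leaves a copy in `seen`, which is how `SaveOutput` captures `sim` and `obs` without changing them.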
