Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NWM 2.1 Issue for Puerto Rico #55

Open
SorooshMani-NOAA opened this issue Dec 20, 2022 · 13 comments
Open

NWM 2.1 Issue for Puerto Rico #55

SorooshMani-NOAA opened this issue Dec 20, 2022 · 13 comments

Comments

@SorooshMani-NOAA
Copy link
Contributor

NWM writer runs into issue when there are multiple entries for the element pairing. The following function call results in an exception

src_idxs = get_aggregated_features(nc_fid0, self.pairings.sources.values())

which happens at:
in_file.append(idx.item())

when there are multiple matches in a pairing map. In my case I noticed this with a north Atlantic domain mesh, and the region of the issue was Puerto Rico reaches (not CONUS!). I'm still trying to figure out what is different between PR and CONUS reaches that causes this.

@SorooshMani-NOAA
Copy link
Contributor Author

@jreniel I noticed this issue when using NWM 2.1 vs 2.0 hydrofabric dataset. @cuill has some more information about why it happens. Due to the workload she suggests waiting until NWM 3.0 next year. These were discussed in an email between me and her. @josephzhang8 suggested that I cc you on the issue as well.

@SorooshMani-NOAA
Copy link
Contributor Author

@jreniel, adding @cuill email for reference:

The issue is Puerto Rico doesn't share the same file with CONUS. PySCHISM only downloads CONUS data. The ElementPairing is correct, but feature ids for Puerto Rico are not in the downloaded netCDF files. Therefore variable "idx" on this line

in_file.append(idx.item())
is empty.

@SorooshMani-NOAA
Copy link
Contributor Author

I was able to reproduce the issue using this script and the attached mesh:

    mesh = Hgrid.open('mesh_w_bdry.grd', crs=4326)
    nwm = NWM()
    nwm.write(
        output_directory='.',
        gr3=mesh,
        start_date=datetime.datetime(2018, 8, 28, 6, 0, tzinfo=datetime.timezone.utc),
        end_date=2.0,
        overwrite=False,
        nprocs=-1,
        product=None,
        msource=True,
        vsource=True,
        vsink=True,
        source_sink=True
        )

mesh_w_bdry.zip

@SorooshMani-NOAA SorooshMani-NOAA changed the title NWM Issue with Multiple Pairing Match NWM 2.1 Issue for Puerto Rico Dec 22, 2022
@SorooshMani-NOAA
Copy link
Contributor Author

Based on @cuill's email, the title was misleading. The issue is not multiple pairing, but it's just missing files. So I changed the issue title to reflect that.

@SorooshMani-NOAA
Copy link
Contributor Author

Just FYI, I just tested downloading NWM hydrofabric dataset using the link in nwm.py and it downloads NWM 2.1.

@jreniel
Copy link
Member

jreniel commented Dec 23, 2022

If I remember correctly, one of my pending issues was to put a guard for the case where there mesh has absolutely no intersections with the input data. The reason I did not put a guard immediately is because I was unsure if I should just print a warning or raise an exception. I am more inclined towards the exception, which makes it clear to the user that the input dataset is not consistent with the mesh. Think about whether it is more desirable to print a warning or raise an exception in this case. I vote for exception with a clear message: No intersections found between input file X and the mesh. On the other hand, if we allow for more than 1 file as input, it should not raise. It should only raise iff len(idxs) == 0.

@josephzhang8
Copy link
Member

josephzhang8 commented Dec 23, 2022 via email

@SorooshMani-NOAA
Copy link
Contributor Author

@jreniel I would prefer exception with the right message as well. However in this case the mesh does intersect with the NWM reaches. The problem (based on what I understood from @cuill) seems to be that we don't have data file for PR region for intersections of mesh with PR reaches.

@jreniel
Copy link
Member

jreniel commented Dec 24, 2022

we don't have data file for PR region for intersections of mesh with PR reaches

I see. We still need to make sure we raise clear exceptions when we know we will run into errors.
If the problem is missing data, and not the reaches, how does waiting NWM 3.0 help?
Will the PR data be bundled in the same URLs for NWM3.0 or are we still going to need to pull the data from a separate URL? If we still need to pull from separate URL, then it would be nice to post here what that URL is, or would be, for quick reference in the future.

@SorooshMani-NOAA
Copy link
Contributor Author

I haven't looked into NWM 3.0, so I don't know. I think the reasoning was that since @cuill is busy with a lot of other tasks, and NWM 3 will be released soon, it's not worth it to spend time on 2.1 and then again later on 3.0.

@cuill
Copy link
Member

cuill commented Jan 3, 2023

Here is the email from Brian (NOAA NWM):

Hi Linlin, Wei, Fei, and William--unfortunately at present, in contrast to the CONUS, we do not have a multi-year NWM Hawaii retrospective data set for distribution. That may change for NWM v3.0 (due out in mid 2023) but plans are not yet set.Currently, Hawaii data is only available via the real-time data archive. In operations, Hawaii forecasts started on June 19th, 2019. https://console.cloud.google.com/storage/browser/national-water-model/nwm.20190619?pageState=(%22StorageObjectListTable%22:(%22f%22:%22%255B%255D%22))&prefix=&forceOnObjectsSortingFiltering=false

I think the same situation for Puerto Rico. I checked google cloud, it seems that Puerto Rico forecasts started on April 20, 2020:
https://console.cloud.google.com/storage/browser/national-water-model/nwm.20210420?pageState=(%22StorageObjectListTable%22:(%22f%22:%22%255B%255D%22))&prefix=&forceOnObjectsSortingFiltering=false

@SorooshMani-NOAA
With your hgrid I got the error:
feature is 800031171
[2023-01-03 12:53:31,123] pyschism.forcing.source_sink.nwm INFO: index is []
Traceback (most recent call last):
File "gen_sourcesink.py", line 48, in
nwm.write(output_directory, hgrid, startdate, rnday, overwrite=True)
File "/sciclone/data10/lcui01/pyschism/pyschism/forcing/source_sink/nwm.py", line 957, in write
self._fetch_data(
File "/sciclone/data10/lcui01/pyschism/pyschism/forcing/source_sink/nwm.py", line 899, in _fetch_data
src_idxs = get_aggregated_features(nc_fid0, self.pairings.sources.values())
File "/sciclone/data10/lcui01/pyschism/pyschism/forcing/source_sink/nwm.py", line 339, in get_aggregated_features
in_file.append(idx.item())
ValueError: can only convert an array of size 1 to a Python scalar`

FeatureID 800031171 is in nwm_v2_1_hydrofabric_nwm_reaches_puertorico layer, however, we only downloaded the data for CONUS, which caused idx is empty:
idx=np.where(nc_feature_id == int(feature))[0]

To work around this, probably only read nwm_reaches_conus into gdf:

for reach_layer in [

@SorooshMani-NOAA
Copy link
Contributor Author

@cuill, thank you for your comment. You're right, the issue is that I read reaches info for which data files are not downloaded. This however is from the NWM 2.1 reaches dataset. NWM 2.0 doesn't have PR reaches (right?). That's why I rename the issue title to reflect that the issue occurs when using hydrofabric for NWM 2.1.

@cuill
Copy link
Member

cuill commented Jan 4, 2023

@SorooshMani-NOAA Yes, NWM2.0 doesn't have PR reaches layer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants