Cached NWM data still requires internet access #123

SorooshMani-NOAA · 2024-04-08T20:18:07Z

Using ensembleperturbation, we set up many SCHISM run directories. Since this can be compute-intensive for large ensembles, we run the ensemble generation on compute nodes (no internet). To avoid downloading the data from a compute node, I first set up a single run to cache the data and then set up the rest on the compute node.

At first I thought this solution would work, but then I realized PySCHISM still needs internet access: the caching kicks in for the data, but the metadata is still fetched from the internet every time. See:
Metadata

pyschism/pyschism/forcing/source_sink/nwm.py

Lines 490 to 498 in fc41b51

    
    paginator = self.s3.get_paginator("list_objects_v2")
    pages = paginator.paginate(
        Bucket=self.bucket, Prefix=f"model_output/{year}"
    )
    self.data = []
    for page in pages:
        for obj in page["Contents"]:
            self.data.append(obj)

and
Data

pyschism/pyschism/forcing/source_sink/nwm.py

Lines 524 to 525 in fc41b51

    
    cached_file = list(self.tmpdir.glob(f'**/{filename.name}'))
    if len(cached_file) == 1:
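One way the metadata lookup could be made to work offline is to store the bucket listing on disk next to the cached data and reuse it when it exists. The following is only a minimal sketch of that idea, not PySCHISM's implementation: `load_listing`, `cache_file`, and `fetch` are hypothetical names, and `fetch` stands in for the paginator loop quoted above.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List


def load_listing(cache_file: Path, fetch: Callable[[], List[Dict]]) -> List[Dict]:
    """Return the bucket listing, preferring a JSON copy on disk.

    `fetch` (hypothetical) is only called when no cached listing exists, so a
    node without internet access can reuse a listing written earlier elsewhere.
    """
    if cache_file.is_file():
        return json.loads(cache_file.read_text())

    listing = fetch()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    # default=str converts values JSON cannot encode (e.g. datetime) to strings
    text = json.dumps(listing, default=str)
    cache_file.write_text(text)
    # Return the round-tripped form so first and later calls look identical
    return json.loads(text)
```

With something like this, the first run on a node with internet would write the listing, and later runs on compute nodes would read it without touching S3. A listing cached this way can go stale, so a real implementation would probably need a way to refresh it.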


SorooshMani-NOAA · 2024-04-08T20:19:23Z

As a short-term solution, I can just copy the downloaded data from one setup directory to another, but since I'm using automation around pyschism, I was hoping to use the built-in caching capability in this case.
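The copy workaround could be scripted roughly as follows. This is a sketch under assumptions: `seed_cache` is a hypothetical helper, and the default `*.nc` pattern assumes the downloaded NWM files are NetCDF files. Relative subpaths are preserved, since the cache lookup quoted above searches recursively by file name.

```python
import shutil
from pathlib import Path


def seed_cache(src: Path, dst: Path, pattern: str = "*.nc") -> int:
    """Copy files matching `pattern` from `src` into `dst`, keeping subpaths.

    Returns the number of files copied; files already present are skipped.
    """
    copied = 0
    for path in src.rglob(pattern):
        if not path.is_file():
            continue
        target = dst / path.relative_to(src)
        if target.exists():
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        # copy2 also preserves timestamps where the platform allows it
        shutil.copy2(path, target)
        copied += 1
    return copied
```

This only moves the data files, so it does not by itself remove the metadata request described in the first comment.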

SorooshMani-NOAA mentioned this issue Apr 9, 2024

Allow for custom NWM cache dir noaa-ocs-modeling/CoupledModelDriver#164

Open
