Directly using a DataArray in the RasterScanner() #27

TBuskop · 2023-08-02T08:40:24Z

When using the DamageScanner RasterScanner() function I found it usefull to directly use a DataArray of loaded and manipulated datasets. This way manipulations do not have to be saved to a .tif file before they are given to the RasterScanner() function. I placed the altered code below if this is something you might want to add.

def RasterScanner(landuse_file,
                  hazard_file,
                  curve_path,
                  maxdam_path,
                  lu_crs=28992,
                  haz_crs=4326,                   
                  hazard_col='FX',                  
                  dtype = np.int32,
                  save=False,
                  **kwargs):
    """
    Raster-based implementation of a direct damage assessment.
    
    Arguments:
        *landuse_file* : GeoTiff of DataArray shape (y, x) with land-use information per grid cell. Make sure 
        the land-use categories correspond with the curves and maximum damages 
        (see below). Furthermore, the resolution and extend of the land-use map 
        has to be exactly the same as the inundation map. When using DataArray make sure that the map has the same crs and resolution
        as the hazard map since this is not adjusted for this type.
     
        *hazard_file* : GeoTiff, netCDF4, or DataArray of shape (y, x) with hazard intensity per grid cell. Make sure 
        that the unit of the hazard map corresponds with the unit of the 
        first column of the curves file. When using DataArray make sure that the map has the same crs and resolution
        as the landuse map since this is not adjusted for this type.
     
        *curve_path* : File with the stage-damage curves of the different 
        land-use classes. Values should be given as ratios, i.e. between 0 and 1.
        Can also be a pandas DataFrame or numpy Array.
     
        *maxdam_path* : File with the maximum damages per land-use class 
        (in euro/m2). Can also be a pandas DataFrame or numpy Array.
                        
        *dtype*: Set the dtype to the requires precision. This will affect the output damage raster as well 
     
    Optional Arguments:
        *save* : Set to True if you would like to save the output. Requires 
        several **kwargs**
        
    kwargs:
        *nan_value* : if nan_value is provided, will mask the inundation file. 
        This option can significantly fasten computations
        
        *cell_size* : If both the landuse and hazard map are numpy arrays, 
        manually set the cell size.

        *resolution* : If landuse is a numpy array, but the hazard map 
        is a netcdf, you need to specify the resolution of the landuse map.

        *output_path* : Specify where files should be saved.
        
        *scenario_name*: Give a unique name for the files that are going to be saved.
        
        *in_millions*: Set to True if all values should be set in millions.
        
        *crs*: Specify crs if you only read in two numpy array
        
        *transform*: Specify transform if you only read in numpy arrays in order to save the result raster
        
    Raises:
        *ValueError* : on missing kwarg options

    Returns:    
     *damagebin* : Table with the land-use class numbers (1st column) and the 
     damage for that land-use class (2nd column).
     
     *damagemap* : Map displaying the damage per grid cell of the area.
     
    """
    # load land-use map
    if isinstance(landuse_file, str):
        with rasterio.open(landuse_file) as src:
            landuse = src.read()[0, :, :]
            transform = src.transform
            resolution = src.res[0]
            cellsize = src.res[0] * src.res[1]
    else:
        landuse = landuse_file.copy()

    landuse_in = landuse.copy()
        
    # Load hazard map
    if isinstance(hazard_file, str):
        if (hazard_file.endswith('.tif') | hazard_file.endswith('.tiff')):      
            with rasterio.open(hazard_file) as src:
                hazard = src.read()[0, :, :]
                transform = src.transform
                
        elif hazard_file.endswith('.nc'):

            # Open the hazard netcdf file and store it in the hazard variable
            hazard = xr.open_dataset(hazard_file)
    
            # Open the landuse geotiff file and store it in the landuse variable
            landuse = xr.open_dataset(landuse_file, engine="rasterio")

            # Match raster to vector
            hazard,landuse = match_raster_to_vector(hazard,landuse,lu_crs,haz_crs,resolution,hazard_col)  

    # if hazard file is a not a string check if it is a xr.DataArray
    elif isinstance(hazard_file, xr.DataArray):
        
        # Check if landuse is also a xr.Dataset
        if isinstance(landuse_file,xr.DataArray):
            landuse = landuse_file.values
            hazard = hazard_file.values

        # if the landuse file is a string, open it as a rasterio file
        else:
            # Open the landuse geotiff file and store it in the landuse variable
            landuse = xr.open_dataset(landuse_file, engine="rasterio")
            # Match raster to vector
            hazard,landuse = match_raster_to_vector(hazard_file,landuse,lu_crs,haz_crs,resolution,hazard_col)
     

    else:
        hazard = hazard_file.copy()

    # check if land-use and hazard map have the same shape.
    if landuse.shape != hazard.shape:
        # if one of the files are a DataArray, the matching prodecure does not work and it will need to be done manually by the user
        if (type(landuse) == xr.DataArray) or (type(hazard) == xr.DataArray):
            raise ValueError(f"The landuse map {landuse.shape} and hazard map {hazard.shape} are not the same shape. Fix that first. Or save both files as .tif and try again.")
        
        warnings.warn(
            "WARNING: landuse and hazard maps are not the same shape. Let's fix this first manually and try again!"
        )
        
        landuse, hazard, intersection = match(landuse_file, hazard_file)

        # create the right affine for saving the output
        transform = Affine(transform[0], transform[1], intersection[0],
                           transform[3], transform[4], intersection[1])

    # set cellsize:
    if isinstance(landuse_file, str) | isinstance(hazard_file, str):
        cellsize = src.res[0] * src.res[1]
    elif isinstance(landuse_file, xr.DataArray) & isinstance(hazard_file, xr.DataArray):
        cellsize = abs(landuse_file.rio.resolution()[0] * landuse_file.rio.resolution()[1])
    else:
        try:
            cellsize = kwargs['cellsize']
        except KeyError:
            raise ValueError("Required `cellsize` not given.")

    # Load curves
    if isinstance(curve_path, pd.DataFrame):
        curves = curve_path.values
    elif isinstance(curve_path, np.ndarray):
        curves = curve_path
    elif curve_path.endswith('.csv'):
        curves = pd.read_csv(curve_path).values
    
    if ((curves>1).all()) or ((curves<0).all()):
        raise ValueError("Stage-damage curve values must be between 0 and 1")

    # Load maximum damages
    if isinstance(maxdam_path, pd.DataFrame):
        maxdam = maxdam_path.values
    elif isinstance(maxdam_path, np.ndarray):
        maxdam = maxdam_path
    elif maxdam_path.endswith('.csv'):
        maxdam = pd.read_csv(maxdam_path).values
    
    if maxdam.shape[0] != (curves.shape[1]-1):
        raise ValueError("Dimensions between maximum damages and the number of depth-damage curve do not agree")
  
    # Speed up calculation by only considering feasible points
    if kwargs.get('nan_value'):
        nan_value = kwargs.get('nan_value')
        hazard[hazard == nan_value] = 0

    haz = hazard * (hazard >= 0) + 0   
    haz[haz >= curves[:, 0].max()] = curves[:, 0].max()
        
    area = haz > 0
    haz_intensity = haz[haz > 0]
    landuse = landuse[haz > 0]

    # Calculate damage per land-use class for structures
    numberofclasses = len(maxdam)
    alldamage = np.zeros(landuse.shape[0])
    damagebin = np.zeros((
        numberofclasses,
        2,
    ))
    for i in range(0, numberofclasses):
        n = maxdam[i, 0]
        damagebin[i, 0] = n
        wd = haz_intensity[landuse == n]
        alpha = np.interp(wd, ((curves[:, 0])), curves[:, i + 1])
        damage = alpha * (maxdam[i, 1] * cellsize)
        damagebin[i, 1] = sum(damage)
        alldamage[landuse == n] = damage
    
    # create the damagemap
    damagemap = np.zeros((area.shape[0], area.shape[1]), dtype=dtype)
    damagemap[area] = alldamage

    # create pandas dataframe with output
    damage_df = pd.DataFrame(damagebin.astype(dtype),
                           columns=['landuse',
                                    'damages']).groupby('landuse').sum()
    
    # create damage rioxarray object with the right crs and transform
    if isinstance(hazard_file, xr.DataArray):
        damagemap_rio = hazard_file
        damagemap_rio.values = damagemap

        # rename values attribute to damages
        # clear attrs
        damagemap_rio.attrs = {}
        # set attrs
        damagemap_rio.attrs['unit'] = '€'
        damagemap_rio.attrs['long_name'] = 'Damage'
 
    if save:
        if isinstance(hazard_file, xr.DataArray):
            crs = hazard_file.rio.crs
            transform = hazard_file.rio.transform()
        else:
            crs = kwargs.get('crs', src.crs)
            transform = kwargs.get('transform', transform)

        # requires adding output_path and scenario_name to function call
        # If output path is not defined, will place file in current directory
        output_path = check_output_path(kwargs)
        scenario_name = check_scenario_name(kwargs)
        path_prefix = p_join(output_path, scenario_name)

        damage_fn = '{}_damages.csv'.format(path_prefix)
        damage_df.to_csv(damage_fn)

        dmap_fn = '{}_damagemap.tif'.format(path_prefix)
        rst_opts = {
            'driver': 'GTiff',
            'height': damagemap.shape[0],
            'width': damagemap.shape[1],
            'count': 1,
            'dtype': dtype,
            'crs': crs,
            'transform': transform,
            'compress': "LZW"
        }
        with rasterio.open(dmap_fn, 'w', **rst_opts) as dst:
            dst.write(damagemap, 1)

    if 'in_millions' in kwargs:
        damage_df = damage_df / 1e6

    # return output
    if isinstance(hazard_file, xr.DataArray):
        # return damage map as DataArray
        return damage_df, damagemap_rio, landuse_in, hazard
    else:
        return damage_df, damagemap, landuse_in, hazard

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Directly using a DataArray in the RasterScanner() #27

Directly using a DataArray in the RasterScanner() #27

TBuskop commented Aug 2, 2023

Directly using a DataArray in the RasterScanner() #27

Directly using a DataArray in the RasterScanner() #27

Comments

TBuskop commented Aug 2, 2023