Systematic identification of variables from an xarray Dataset #886

Open
jthielen opened this issue Jul 10, 2018 · 4 comments
Labels
Area: Xarray Pertains to xarray integration Type: Feature New functionality

Comments

@jthielen
Collaborator

Corresponding to #860, it would seem useful to also be able to systematically identify variables from an xarray Dataset. A simple use case would be something like what motivated this issue, #662, where we want to identify each of the components of the 3D wind field and then do some calculations on those. This would also likely be a prerequisite for #3 (whenever enough pieces are in place for that to be implemented).

An initial approach could be to search for the standard_name attribute and adhere strictly to the CF Standard Name list, while giving the user the option to supply a dictionary that fills in standard names where they are missing. However, would there be cases where no CF standard name exists for the quantity we want? Or should there be some kind of automatic processing to fill in missing standard_name attributes? Then again, anything much more flexible or complex would likely become even messier than systematic coordinate identification ended up being.
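As a rough sketch of that initial approach (not an existing MetPy function), the lookup could work on any mapping of variable name to an object with an `attrs` dict, which is the shape `ds.data_vars` has in xarray. The function name `find_by_standard_name` and the `fallback` parameter are hypothetical, and the `SimpleNamespace` objects are only stand-ins so the example runs without xarray installed.

```python
from types import SimpleNamespace


def find_by_standard_name(variables, standard_name, fallback=None):
    """Return the names of variables whose CF standard_name matches.

    variables: mapping of name -> object with an ``attrs`` dict
        (hypothetical helper; with xarray, ``ds.data_vars`` has this shape).
    fallback: optional dict of variable name -> standard name, consulted
        only when a variable has no ``standard_name`` attribute of its own.
    """
    fallback = fallback or {}
    matches = []
    for name, var in variables.items():
        # Prefer the attribute stored on the variable, then the user's dict.
        found = var.attrs.get('standard_name', fallback.get(name))
        if found == standard_name:
            matches.append(name)
    return matches


# Stand-ins for DataArrays, purely for illustration.
variables = {
    'u_wind': SimpleNamespace(attrs={'standard_name': 'eastward_wind'}),
    'v_wind': SimpleNamespace(attrs={'standard_name': 'northward_wind'}),
    'temperature_isobaric': SimpleNamespace(attrs={}),
}
user_names = {'temperature_isobaric': 'air_temperature'}

wind_u = find_by_standard_name(variables, 'eastward_wind')
temps = find_by_standard_name(variables, 'air_temperature', fallback=user_names)
print(wind_u, temps)
```

Returning a list rather than a single name leaves the ambiguous case (several variables sharing one standard name, e.g. on different vertical coordinates) for the caller to resolve.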

@jthielen jthielen added Type: Feature New functionality Area: Xarray Pertains to xarray integration labels Jul 10, 2018
@dopplershift
Member

Is it worth allowing overwriting rather than telling people to do:

my_data.attrs['standard_name'] = 'air_temperature'

@jthielen
Collaborator Author

jthielen commented Jul 11, 2018

I'm not sure...that definitely seems like a good approach for DataArrays, but since (I'd presume) this is most useful on Datasets, that approach would end up like

my_data['temperature_isobaric'].attrs['standard_name'] = 'air_temperature'
my_data['relative_humidity_isobaric'].attrs['standard_name'] = 'relative_humidity'
my_data['geopotential_height_isobaric'].attrs['standard_name'] = 'geopotential_height'

versus something less verbose such as

data.metpy.parse_cf(variables={'temperature_isobaric': 'air_temperature',
                               'relative_humidity_isobaric': 'relative_humidity',
                               'geopotential_height_isobaric': 'geopotential_height'})

I'd prefer the second, but what do you think?
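For illustration only, the dictionary-based option could be backed by a small helper along these lines. The name `fill_standard_names` and its `overwrite` flag are hypothetical, not part of MetPy or xarray; the flag is one possible answer to the overwriting question above. The stand-in objects again only mimic the `attrs` dict of a DataArray.

```python
from types import SimpleNamespace


def fill_standard_names(variables, names, overwrite=False):
    """Set ``standard_name`` on each listed variable; return the names changed.

    variables: mapping of name -> object with a mutable ``attrs`` dict.
    names: dict of variable name -> CF standard name to assign.
    overwrite: if False, variables that already have a standard_name are
        left untouched.
    """
    changed = []
    for name, standard_name in names.items():
        attrs = variables[name].attrs
        if overwrite or 'standard_name' not in attrs:
            attrs['standard_name'] = standard_name
            changed.append(name)
    return changed


# Stand-ins for DataArrays, purely for illustration.
data = {
    'temperature_isobaric': SimpleNamespace(attrs={}),
    'relative_humidity_isobaric': SimpleNamespace(
        attrs={'standard_name': 'humidity'}),
}
mapping = {
    'temperature_isobaric': 'air_temperature',
    'relative_humidity_isobaric': 'relative_humidity',
}

changed = fill_standard_names(data, mapping)
kept = data['relative_humidity_isobaric'].attrs['standard_name']
changed_all = fill_standard_names(data, mapping, overwrite=True)
replaced = data['relative_humidity_isobaric'].attrs['standard_name']
```

Defaulting to `overwrite=False` keeps whatever metadata the file already carries unless the user explicitly asks to replace it.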

@dopplershift
Member

Do you have a current set of data where this feature is necessary?

@jthielen
Collaborator Author

jthielen commented Jul 11, 2018

As for systematic identification itself, the only concrete case so far is the motivating example mentioned above, but I could also see it opening up possibilities for future calculations where the user just passes a dataset and the function pulls out what it needs.

As for filling in the standard_name, most of the data I've been working with would need this: most of the GRIB-converted data from the THREDDS servers I've used is missing the standard_name attribute (this includes the NARR and Irma GFS examples in staticdata). Also, no surprise, but non-post-processed WRF output seems to lack it as well.

But, having now actually looked into this and found how common it is for datasets to be missing the standard_name attribute, would programmatic ways of identifying the type of variable be necessary for this to be practical? Or is a different approach, not based on standard_name, needed? (Either way, it seems like something that would take too much effort to work on right now.)
