TRT-552 - Implement configuration file schema v1.0.0
* TRT-553 - Read all group metadata attributes
* TRT-554 - Flatten overrides and supplements
* TRT-555 - Remove CF_Supplements
* TRT-556 - Remove ProductEpochs and Grid_Mapping_Data
* TRT-556 - Rename CFOverrides to MetadataOverrides
owenlittlejohns authored Sep 13, 2024
1 parent bccf65b commit ccca299
Showing 23 changed files with 2,335 additions and 738 deletions.
43 changes: 39 additions & 4 deletions CHANGELOG.md
@@ -1,21 +1,56 @@
## v3.0.0
### 2024-09-11

The configuration file schema for `earthdata-varinfo` is significantly updated
in this release. For more information, see the release notes for schema v1.0.0
in `config/CHANGELOG.md`.

### Added:

* Groups within a NetCDF-4 or DMR file are now assigned to the `VarInfo*.groups`
dictionary, allowing for their metadata attributes to be accessed after parsing
an input file.

### Changed:

* `CFConfig.get_cf_attributes` has been renamed `CFConfig.get_metadata_overrides`,
as there are now only overrides to be returned from this method. Calls to
`CFConfig.get_metadata_overrides` now _must_ specify a variable path. All
overrides from a configuration file for a given collection are now retrievable
from the newly public `CFConfig.metadata_overrides` class attribute.
* Metadata overrides retrieved for a variable path are ordered such that the
most specific applicable override takes precedence. For example, when
requesting the value of the "units" metadata attribute for variable
"/nested/variable", an applicability rule that exactly matches this variable
path takes precedence over rules that match either the containing group or
all variables in the file.
* Handling of nested `Applicability_Groups` has been removed from the `CFConfig`
class, as the configuration file no longer nests these items in overrides.
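
The precedence rule described above can be sketched as follows. This is a minimal illustration and not the `CFConfig` implementation: the `OVERRIDES` list, the `specificity` ranking and the `resolve_overrides` helper are all hypothetical, and the ranking itself (no pattern, then regular expression match, then exact path match) is an assumption about what "most specific" means.

```python
import re

# Hypothetical, already-flattened override rules: (VariablePattern or None, attributes).
OVERRIDES = [
    (None, {"units": "1"}),                       # applies to every variable
    (r"/nested/.*", {"units": "m"}),              # applies to a whole group
    (r"/nested/variable", {"units": "degrees"}),  # applies to one variable
]


def specificity(pattern, variable_path):
    """Rank a rule for a variable path, or return None if it does not apply.

    Assumed ranking for this sketch: 0 = no pattern (all variables),
    1 = regular expression match, 2 = pattern identical to the path.
    """
    if pattern is None:
        return 0
    if pattern == variable_path:
        return 2
    if re.match(pattern, variable_path):
        return 1
    return None


def resolve_overrides(variable_path, overrides=OVERRIDES):
    """Merge all applicable overrides so the most specific rule wins."""
    applicable = []
    for pattern, attributes in overrides:
        rank = specificity(pattern, variable_path)
        if rank is not None:
            applicable.append((rank, attributes))

    merged = {}
    # Apply least specific rules first so later, more specific ones overwrite them.
    for _, attributes in sorted(applicable, key=lambda item: item[0]):
        merged.update(attributes)
    return merged


print(resolve_overrides("/nested/variable"))
```

Under these assumptions, the exact-path rule supplies "units" for "/nested/variable", the group rule supplies it for other variables in "/nested", and the pattern-free rule covers everything else.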

### Removed:

* `CFConfig._cf_supplements` has been removed in favour of specifying all
in-file metadata changes via a `MetadataOverrides` item (formerly
`CFOverrides`).

## v2.3.0
### 2024-08-26

-The VarInfoBase.get_missing_variable_attributes method has been added to allow
+The `VarInfoBase.get_missing_variable_attributes` method has been added to allow
someone to get metadata attributes from the configuration file for variables
that are absent from a file. An example usage is when a CF Convention grid
mapping variable is missing from a source file.
-The VarInfoBase.get_references_for_attribute method has been added to retrieve
+The `VarInfoBase.get_references_for_attribute` method has been added to retrieve
all unique variable references contained in a single metadata attribute for a
list of variables. For example, retrieving all references listed under the
coordinates metadata attribute.
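
The idea behind collecting unique references can be sketched in a few lines. This is not the `VarInfoBase` implementation: `collect_references` and the `VARIABLE_ATTRIBUTES` mapping are hypothetical, and the sketch assumes references are whitespace-separated (as in a CF `coordinates` attribute) without resolving relative paths.

```python
# Hypothetical parsed metadata: variable path -> metadata attributes.
VARIABLE_ATTRIBUTES = {
    "/science_one": {"coordinates": "/latitude /longitude", "units": "K"},
    "/science_two": {"coordinates": "/latitude /longitude /time"},
    "/latitude": {"units": "degrees_north"},
}


def collect_references(variable_attributes, variable_paths, attribute_name):
    """Return the set of unique references found in one attribute.

    Variables that are unknown, or that lack the attribute, contribute nothing.
    """
    references = set()
    for path in variable_paths:
        value = variable_attributes.get(path, {}).get(attribute_name)
        if value is not None:
            # Assumption: references are separated by whitespace.
            references.update(value.split())
    return references


print(sorted(collect_references(
    VARIABLE_ATTRIBUTES, ["/science_one", "/science_two"], "coordinates"
)))
```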

## v2.2.2
### 2024-07-16

-The generate_collection_umm_var function in earthdata-varinfo updated to support an
-optional kwarg 'config_file=' for a configuration file, to be able to override known metadata errors.
+The `generate_collection_umm_var` function in earthdata-varinfo has been updated to
+support an optional kwarg `config_file` for a configuration file, to be able to
+override known metadata errors.


## v2.2.1
14 changes: 8 additions & 6 deletions README.md
@@ -24,13 +24,14 @@ attributes.
from varinfo import CFConfig
cf_config = CFConfig('ICESat2', 'ATL03', config_file='config/0.0.1/sample_config_0.0.1.json')
-cf_attributes = cf_config.get_cf_attributes('/full/variable/path')
+metadata_attributes = cf_config.get_metadata_overrides('/full/variable/path')
```

### VarInfo

-A group of classes that contain the relations between all variables within a
-single granule. Current classes include:
+A group of classes that contain metadata attributes for all groups and
+variables in a single granule, and the relations between all variables within
+that granule. Current classes include:

* VarInfoBase: An abstract base class that contains core logic and methods used
by the child classes that parse different sources of granule information.
@@ -66,9 +67,10 @@ var_info.get_spatial_dimensions({'/path/to/science/variable'})

The `VarInfoFromDmr` and `VarInfoFromNetCDF4` classes also have an optional
argument `short_name`, which can be used upon instantiation to specify the
-short name of the collection to which the granule belongs. This option is to be
-used when a granule does not contain the collection short name within its
-metadata global attributes (e.g., ABoVE collections from ORNL).
+short name of the collection to which the granule belongs. This option is the
+preferred way to specify a collection short name, and particularly encouraged
+for use when a granule does not contain the collection short name within its
+metadata attributes (e.g., ABoVE collections from ORNL).

```
var_info = VarInfoFromDmr('/path/to/local/file.dmr', short_name='ATL03')
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-2.3.0
+3.0.0
154 changes: 154 additions & 0 deletions config/1.0.0/earthdata_varinfo_configuration_schema.json
@@ -0,0 +1,154 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "earthdata-varinfo configuration file",
"description": "A schema for the configuration file used by earthdata-varinfo to augment CF-Convention metadata in granules.",
"type": "object",
"additionalProperties": false,
"properties": {
"Identification": {
"description": "A description indicating the tool for which earthdata-varinfo and this configuration file will be used.",
"type": "string",
"minLength": 1
},
"Version": {
"description": "A numeric identifier for the version of the specific configuration file (not the schema version itself).",
"type": "integer"
},
"CollectionShortNamePath": {
"description": "A list of HDF metadata attribute paths that provide the shortname value of the collection for the data file being processed. Processed in the listed order.",
"type": "array",
"items": {
"type": "string",
"minLength": 1
}
},
"Mission": {
"description": "A set of mission names that are defined for matching short name values.",
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"ExcludedScienceVariables": {
"description": "VarInfo classes currently assume that any variable that has a grid mapping attribute, or has a spatial or temporal dimension and is not itself a dimension or bounds variable, should be treated as a science variable. This may not be true in all cases, and so ExcludedScienceVariables provide a method to denote non-science variables that might otherwise be incorrectly identified.",
"type": "array",
"items": {
"$ref": "#/$defs/MissionVariablePatternType"
}
},
"RequiredVariables": {
"description": "VarInfo classes will calculate a set of required variables for a given science variable. This setting imposes additional contents for the required variables list.",
"type": "array",
"items": {
"$ref": "#/$defs/MissionVariablePatternType"
}
},
"MetadataOverrides": {
"description": "For cases where CF references do not exist, or are invalid. For example, variables that have no dimension references in the HDF-5 file contents.",
"type": "array",
"items": {
"$ref": "#/$defs/MetadataOverridesItemType"
}
}
},
"required": ["Identification", "Version", "CollectionShortNamePath", "Mission"],
"$defs": {
"ApplicabilityType": {
"description": "An object that specifies a combination of satellite mission, collection short name and variable patterns to which a set of attributes should be applied. At least one of Mission or ShortNamePath must be specified; VariablePattern is optional.",
"type": "object",
"properties": {
"Mission": {
"description": "The name of a mission to which the attributes can be applied. This mission name should match one listed in the Mission mapping of this schema.",
"type": "string"
},
"ShortNamePath": {
"description": "The short name for the collection to which a granule belongs.",
"type": "string"
},
"VariablePattern": {
"description": "A regular expression identifying all variables to which the schema item should be applied.",
"type": "string"
}
},
"anyOf": [{
"required": ["Mission"]
}, {
"required": ["ShortNamePath"]
}],
"additionalProperties": false
},
"AttributesItemType": {
"description": "An object that includes the name and value that should be used to either extend or overwrite a metadata attribute for applicable variables.",
"type": "object",
"properties": {
"Name": {
"description": "The metadata attribute name.",
"type": "string"
},
"Value": {
"description": "The overriding metadata attribute value. The value specified in the configuration file will replace the corresponding metadata value in any applicable source file.",
"anyOf": [{
"type": ["number", "string"]
}, {
"type": "array",
"items": {
"type": "number"
}
}]
}
},
"required": ["Name", "Value"],
"additionalProperties": false
},
"AttributesType": {
"description": "A list of metadata attributes to be updated for variables identified by the applicability rule.",
"type": "array",
"items": {
"$ref": "#/$defs/AttributesItemType"
}
},
"MissionVariablePatternType": {
"description": "An object that defines a list of variables, as strings or regular expressions, that should be considered as either required variables or excluded as science variables for a given collection.",
"type": "object",
"properties": {
"Applicability": {
"description": "The mission and/or collection short name to which the list of required variables or excluded variables should be applied.",
"$ref": "#/$defs/ApplicabilityType"
},
"VariablePattern": {
"description": "A list of variable strings or regular expression patterns that should match variables to be excluded or required for a given collection or mission.",
"type": "array",
"items": {
"type": "string"
}
}
},
"required": ["Applicability", "VariablePattern"],
"additionalProperties": false
},
"MetadataOverridesItemType": {
"description": "An item that details one or more metadata attributes to overwrite according to the supplied applicability rules.",
"type": "object",
"properties": {
"_Description": {
"description": "Explains the purpose and effect of these overrides.",
"type": "string"
},
"Applicability": {
"description": "An applicability rule that indicates which groups and variables within a file a metadata override should apply to. If only a short name and/or mission is provided, the override will apply to all groups and variables. If a VariablePattern is also provided, the override is applied only to those groups or variables whose paths match the regular expression of the VariablePattern.",
"$ref": "#/$defs/ApplicabilityType"
},
"Attributes" : {
"description": "Metadata attributes to override for variables or groups that match the mission, short name and/or VariablePattern criteria specified in the Applicability of this object.",
"type": "array",
"items": {
"description": "A list of metadata attributes with their names and values.",
"$ref": "#/$defs/AttributesItemType"
}
}
},
"additionalProperties": false,
"required": ["Applicability", "Attributes"]
}
}
}
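
A configuration file following this schema might look like the sketch below. All values in `sample_config` are hypothetical (including the attribute paths, and the assumption that `Mission` keys are short name patterns), and `check_top_level` is an illustrative helper that only checks the top-level `required` and `additionalProperties` rules using the standard library. A full JSON Schema validator would be needed to check the nested `$defs`.

```python
import json

# Top-level property names taken from the v1.0.0 schema.
REQUIRED_KEYS = {"Identification", "Version", "CollectionShortNamePath", "Mission"}
OPTIONAL_KEYS = {"ExcludedScienceVariables", "RequiredVariables", "MetadataOverrides"}

# Hypothetical configuration content, for illustration only.
sample_config = {
    "Identification": "Sample configuration for a hypothetical tool",
    "Version": 1,
    "CollectionShortNamePath": ["/HDF5_GLOBAL/short_name", "/short_name"],
    "Mission": {"ATL\\d{2}": "ICESat2"},
    "ExcludedScienceVariables": [
        {
            "Applicability": {"Mission": "ICESat2"},
            "VariablePattern": ["/quality_assessment/.*"],
        }
    ],
    "MetadataOverrides": [
        {
            "_Description": "Set units for all geolocation variables.",
            "Applicability": {
                "Mission": "ICESat2",
                "ShortNamePath": "ATL03",
                "VariablePattern": "/gt[123][lr]/geolocation/.*",
            },
            "Attributes": [{"Name": "units", "Value": "m"}],
        }
    ],
}


def check_top_level(config):
    """Return a list of problems with the top-level keys (empty if none)."""
    problems = []
    for key in sorted(REQUIRED_KEYS - set(config)):
        problems.append(f"missing required property: {key}")
    for key in sorted(set(config) - REQUIRED_KEYS - OPTIONAL_KEYS):
        problems.append(f"unexpected property: {key}")
    return problems


print(check_top_level(sample_config))
print(json.dumps(sample_config["MetadataOverrides"], indent=2))
```

Because the schema sets `additionalProperties` to false at the top level, a leftover key from an older configuration file (such as `CF_Supplements`) would be reported as unexpected by this check.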