not required column and un-ordered columns in add_resource() #254

Closed
Rafnuss opened this issue Aug 21, 2024 · 2 comments

@Rafnuss

Rafnuss commented Aug 21, 2024

I would not expect an error when adding a resource whose data lacks a column that the schema does not mark as required.

The schema at https://raw.githubusercontent.com/Rafnuss/GeoLocator-DP/main/measurements-table-schema.json does not require the valid column, which is missing from my data.

library(frictionless)

# The data has no "valid" column, which the schema defines but does not require
create_package() |>
  add_resource(
    "measurements",
    data.frame(
      "tag_id" = "18LY",
      "sensor" = "pressure",
      "datetime" = "2020-05-01",
      "value" = 12
    ),
    schema = jsonlite::read_json(
      "https://raw.githubusercontent.com/Rafnuss/GeoLocator-DP/main/measurements-table-schema.json"
    )
  )
#> Error in `check_schema()`:
#> ! Field names in `schema` must match column names in `data`.
#> ℹ Field names: "tag_id", "sensor", "datetime", "value", and "valid".
#> ℹ Column names: "tag_id", "sensor", "datetime", and "value".

Also, I am not sure why the columns must be provided in the same order as in the schema. Is it not possible to reorder the data according to the schema?

library(frictionless)

# All schema fields are present, but "valid" and "value" are swapped
create_package() |>
  add_resource(
    "measurements",
    data.frame(
      "tag_id" = "18LY",
      "sensor" = "pressure",
      "datetime" = "2020-05-01",
      "valid" = FALSE,
      "value" = 12
    ),
    schema = jsonlite::read_json(
      "https://raw.githubusercontent.com/Rafnuss/GeoLocator-DP/main/measurements-table-schema.json"
    )
  )
#> Error in `check_schema()`:
#> ! Field names in `schema` must match column names in `data`.
#> ℹ Field names: "tag_id", "sensor", "datetime", "value", and "valid".
#> ℹ Column names: "tag_id", "sensor", "datetime", "valid", and "value".
Rafnuss added a commit to Rafnuss/frictionless-r that referenced this issue Aug 21, 2024
@Rafnuss
Author

Rafnuss commented Aug 21, 2024

Actually, reading more on this, I realised that this depends on fieldsMatch. Maybe a more complex solution is required?

@peterdesmet
Member

Hi @Rafnuss, you (and many others, including me) want optional and reordered fields.

This feature is not supported in Data Package 1.0, which is the version that frictionless currently implements. So right now, you (annoyingly) need to include all columns in your data, even if they are empty. Alternatively, you can preprocess your schema before adding it to your resource.
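
As a rough illustration of the first workaround (adding empty columns and matching the schema order), something like the following could be run on the data before calling add_resource(). This is only a sketch: `align_to_schema()` is a hypothetical helper, not a frictionless function, and the inline `schema` list is a stand-in that contains only the field names from the error message above. In practice the schema would come from `jsonlite::read_json()`.

```r
# Hypothetical helper (not part of frictionless): add schema fields that are
# missing from the data as NA columns, then reorder columns to schema order.
align_to_schema <- function(data, schema) {
  field_names <- vapply(schema$fields, function(field) field$name, character(1))

  # Columns unknown to the schema cannot be fixed by this helper
  unknown <- setdiff(names(data), field_names)
  if (length(unknown) > 0) {
    stop("Columns not defined in schema: ", paste(unknown, collapse = ", "))
  }

  # Add every missing field as an empty (NA) column
  for (name in setdiff(field_names, names(data))) {
    data[[name]] <- rep(NA, nrow(data))
  }

  # Return the columns in the order the schema defines them
  data[, field_names, drop = FALSE]
}

# Stand-in schema (assumption): only the field names are needed here
schema <- list(fields = list(
  list(name = "tag_id"),
  list(name = "sensor"),
  list(name = "datetime"),
  list(name = "value"),
  list(name = "valid")
))

data <- data.frame(
  tag_id = "18LY",
  sensor = "pressure",
  datetime = "2020-05-01",
  value = 12
)
aligned <- align_to_schema(data, schema)
names(aligned)
```

The aligned data frame has the same column names, in the same order, as the schema fields, which is what the name check in the error message compares.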

The feature has indeed been added as fieldsMatch in Data Package 2.0. Frictionless doesn't support 2.0 yet, but we aim to do so (including fieldsMatch). Fully supporting v2 is a daunting task though, so it won't be soon.
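
To make the difference concrete, here is a rough sketch of how a name check could branch on fieldsMatch. It is based on my reading of the Data Package 2.0 modes (exact, equal, subset, superset, partial) and is not how frictionless implements or plans to implement it. `fields_match()` is a hypothetical function, and "exact" is approximated here as identical names in identical order, which is the behaviour of the current check.

```r
# Hypothetical sketch (not part of frictionless): compare data column names
# with schema field names under a given fieldsMatch mode.
fields_match <- function(columns, fields, mode = "exact") {
  switch(
    mode,
    exact = identical(columns, fields),   # same names, same order
    equal = setequal(columns, fields),    # same names, any order
    subset = all(fields %in% columns),    # data has all fields, may have more
    superset = all(columns %in% fields),  # data has only schema fields, may have fewer
    partial = any(columns %in% fields),   # at least one field in common
    stop("Unknown fieldsMatch mode: ", mode)
  )
}

fields <- c("tag_id", "sensor", "datetime", "value", "valid")
missing_valid <- c("tag_id", "sensor", "datetime", "value")
swapped <- c("tag_id", "sensor", "datetime", "valid", "value")

fields_match(missing_valid, fields, "exact")    # the first example in this issue
fields_match(missing_valid, fields, "superset")
fields_match(swapped, fields, "equal")          # the second example in this issue
```

Under this reading, the first example would need "superset" and the second "equal" to pass.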
