Forbid relative paths, and use file URI scheme internally? #242

Open
TomNicholas opened this issue Oct 2, 2024 · 3 comments · May be fixed by #243
Labels
references formats Storing byte range info on disk

Comments

@TomNicholas
Member

TomNicholas commented Oct 2, 2024

We currently allow manifests to contain relative local paths, e.g. test.nc. This is (a) more fragile than an absolute path such as /test.nc, and (b) inconsistent with cloud bucket URLs, which are always absolute.

It would be more robust to ensure that paths in the Manifest are always absolute paths. (If the user wants to move their data around they can always use the .rename_paths method to explicitly adjust the paths in their manifest to point to the files' new locations.)

If all local paths are to be stored as absolute paths, then we might also want to use the file URI scheme, so that local paths are stored as file:///test.nc. That way the form of each path in the Manifest is consistent, whether it is local or remote. However, as kerchunk does not use this scheme, extra conversion steps would be needed to go between the two formats.
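To make the proposal concrete, here is a minimal sketch of the kind of coercion described above, using only the Python standard library. The helper name to_absolute_uri and its base_dir parameter are hypothetical and are not part of VirtualiZarr's API; the sketch also assumes that any string containing "://" is already an absolute URL.

```python
import os
from pathlib import Path
from typing import Optional


def to_absolute_uri(path: str, base_dir: Optional[str] = None) -> str:
    """Hypothetical helper: coerce a manifest path into an absolute URI.

    Assumption: anything containing "://" (e.g. s3://, https://, file://)
    is already an absolute URL and is returned unchanged.
    """
    if "://" in path:
        return path

    # Relative paths are interpreted against base_dir (default: the current
    # working directory). os.path.join ignores base when path is absolute.
    base = base_dir if base_dir is not None else os.getcwd()
    absolute = os.path.abspath(os.path.join(base, path))

    # Path.as_uri() requires an absolute path and emits the file:// scheme.
    return Path(absolute).as_uri()


local_uri = to_absolute_uri("test.nc", base_dir="/data")
remote_uri = to_absolute_uri("s3://bucket/test.nc")
```

Under these assumptions a relative path such as test.nc becomes a file:/// URI anchored at base_dir, while bucket URLs pass through untouched, so every entry ends up in the same absolute form.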

cc @mpiannucci

@mdsumner
Contributor

mdsumner commented Oct 2, 2024

Oh nice, this has caught me out and I'm glad to see it laid out this way.

@mpiannucci
Contributor

Ah, I was wondering what kerchunk does, because according to the docs fsspec supports the file:// scheme. Thanks for this.

@TomNicholas
Member Author

TomNicholas commented Oct 2, 2024

I haven't tried to use fsspec to read data from references that use a file:/// prefix, but the kerchunk readers return dicts containing references that use relative paths. So coercing those relative paths into absolute URIs does break some of VirtualiZarr's kerchunk roundtripping tests.
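One possible shape for the extra conversion step is sketched below: it strips the file:// scheme from a kerchunk-style references mapping so that local entries become plain paths again. The function name strip_file_scheme is hypothetical, the example refs dict is made up for illustration, and the sketch assumes POSIX-style paths and the usual [path, offset, length] layout for chunk references. Note that it yields absolute paths, not the relative ones the kerchunk readers return, so it would not by itself make a roundtrip comparison pass.

```python
from urllib.parse import unquote, urlparse


def strip_file_scheme(refs: dict) -> dict:
    """Hypothetical helper: turn file:// URIs back into plain local paths.

    Assumes chunk references are lists of the form [path, offset, length];
    inlined entries (plain strings such as metadata) are left unchanged.
    """
    converted = {}
    for key, ref in refs.items():
        is_chunk_ref = isinstance(ref, list) and ref and isinstance(ref[0], str)
        if is_chunk_ref and ref[0].startswith("file://"):
            # urlparse("file:///test.nc").path is "/test.nc"
            local_path = unquote(urlparse(ref[0]).path)
            converted[key] = [local_path, *ref[1:]]
        else:
            converted[key] = ref
    return converted


# Illustrative references only, not output from a real kerchunk reader.
example_refs = {
    ".zgroup": '{"zarr_format": 2}',
    "a/0": ["file:///test.nc", 0, 100],
    "b/0": ["s3://bucket/other.nc", 10, 20],
}
plain_refs = strip_file_scheme(example_refs)
```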

3 participants