Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Paths as URIs #243

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
Open

Paths as URIs #243

wants to merge 13 commits into from

Conversation

TomNicholas
Copy link
Member

@TomNicholas TomNicholas commented Oct 2, 2024

This PR closes #242 at the data model level - all paths are coerced to absolute URIS (i.e. file:///directory/test.nc or s3://bucket/test.nc) as they go into the Manifest.

As this forbids constructing manifests using relative paths, it requires minor changes to many tests (e.g. test.nc-> /test.nc). It also will require slightly more invasive changes to any tests that involve kerchunk references.

@TomNicholas TomNicholas marked this pull request as ready for review October 19, 2024 02:33
@TomNicholas
Copy link
Member Author

What should we do with the paths in kerchunk references? Are they are always meant as absolute? I guess we should assume they are absolute, unless they have ./ or ../ at the start? Would fsspec ever produce relative paths like that?

cc @martindurant

@martindurant
Copy link
Member

Are they are always meant as absolute?

They are always meant "as interpreted by the target filesystem". The nature of that filesystem might be implied by the protocol of a path alone, but commonly additional arguments are also required. This means, that relative paths do work if the target happens to be the local filesystem (file://), but I think of the other filesystems, only ssh supports this concept at all. I would not expect this to be meaningful for basically any practical case.

Note that the dir:// filesystem adds prefixes to URLs for any filesystem, if that's useful at all.

@martindurant
Copy link
Member

(I am happy to require absolute paths even if it makes some tests slightly more verbose)

@TomNicholas
Copy link
Member Author

Thanks @martindurant !

The nature of that filesystem might be implied by the protocol of a path alone, but commonly additional arguments are also required.

But is the nature of the filesystem explicitly recorded in the kerchunk references format anywhere? Obviously if the prefix is explicit (e.g. s3://) then you know but otherwise?

that relative paths do work

I guess we should assume they are absolute, unless they have ./ or ../ at the start?

Would this approach work then?

(I am happy to require absolute paths even if it makes some tests slightly more verbose)

This might be helpful if the above approach doesn't work.

@martindurant
Copy link
Member

is the nature of the filesystem explicitly recorded in the kerchunk references format

No. The original intention was to have these in the "templates", but in practice, the remote_protocol, remote_options and fss arguments to ReferenceFileSystem are used (and often encoded in Intake prescriptions) in cases of ambiguity.

Comment on lines +50 to +52
raise ValueError(
f"paths in the manifest must be absolute, but got {path}"
) from e
Copy link
Member Author

@TomNicholas TomNicholas Oct 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FixMe: This often gets caught by a try-except above it, leading to a less specific error being raised.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Forbid relative paths, and use file URI scheme internally?
2 participants