get_child_links/get_item_links: Ensure correct media type #1255

Open
m-mohr opened this issue Oct 5, 2023 · 8 comments

@m-mohr
Contributor

m-mohr commented Oct 5, 2023

It looks like get_child_links, for example, doesn't check the media type.

So if I have child links to STAC Catalogs/Collections and to HTML files (that render the child STAC entities), is it intentional that I also get the HTML files? I see that a lot in OGC API-based implementations.

Generally, does pystac support hierarchical links (child, item, self, parent) with types that are not STAC media types? I mean support in a sense that it doesn't screw up or doesn't throw an error (e.g. by ignoring them).
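To make the scenario concrete, here is a minimal Python sketch that uses plain dictionaries rather than pystac objects. The catalog content, the hrefs, and the helper `links_by_rel` are made up for illustration; the helper only mimics a lookup by relation type and is not pystac's actual implementation.

```python
# Hypothetical catalog: the child is advertised twice,
# once as JSON and once as an HTML rendering of the same entity.
catalog = {
    "type": "Catalog",
    "id": "example",
    "links": [
        {"rel": "self", "href": "./catalog.json", "type": "application/json"},
        {"rel": "child", "href": "./a/collection.json", "type": "application/json"},
        {"rel": "child", "href": "./a/collection.html", "type": "text/html"},
    ],
}


def links_by_rel(links, rel):
    """Select links by relation type only, ignoring the media type."""
    return [link for link in links if link.get("rel") == rel]


children = links_by_rel(catalog["links"], "child")
for link in children:
    print(link["href"], link["type"])
```

With a lookup like this, the HTML link comes back alongside the JSON one, which is the behavior the question is about.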

@m-mohr m-mohr changed the title from "get_child_links: Ensure correct media type?" to "get_child_links/get_item_links: Ensure correct media type?" Oct 5, 2023
@gadomski
Member

gadomski commented Oct 5, 2023

> Generally, does pystac support hierarchical links (child, item, self, parent) with types that are not STAC media types? I mean support in a sense that it doesn't screw up or doesn't throw an error (e.g. by ignoring them).

Probably not, and this seems strange/incorrect per the spec wording, which says that a child rel type should be "URL to a child STAC entity (Catalog or Collection)." So it would be surprising to me if the thing behind a child link was NOT a STAC entity.

@m-mohr
Contributor Author

m-mohr commented Oct 5, 2023

It is a STAC entity, but in a different encoding (i.e. HTML).

My understanding of the STAC spec is that you should still have the corresponding media type to indicate that this thing is a STAC entity.

This all gets pretty interesting when combining STAC with other worlds, like Records, etc. Other things can be items or children in a hierarchical sense, IMHO. Taking the whole relation type just for ourselves seems like a bold claim.

@m-mohr
Contributor Author

m-mohr commented Oct 5, 2023

Related: radiantearth/stac-spec#1259

@m-mohr
Contributor Author

m-mohr commented Oct 5, 2023

Here's an example of such an implementation: https://api.weather.gc.ca/stac/?f=json

@gadomski
Member

gadomski commented Oct 5, 2023

> It is a STAC entity, but in a different encoding (i.e. HTML).

Is this defined somewhere? HTML in particular seems like a strange format for machine-readable STAC metadata. The general point though is taken -- I've encoded STAC metadata in TOML, e.g.

Should this issue then be "support other media types for structural links", with deserializers for any other formats that are out there?

@m-mohr
Contributor Author

m-mohr commented Oct 5, 2023

> Should this issue then be "support other media types for structural links", with deserializers for any other formats that are out there?

No, for me the primary issue is that pystac should not falsely try to load STAC from an HTML page and then error (if that's the case). It should just ignore or pass through such links. Additional support for more file formats would be a different issue (and a stretch goal for the longer term), I think.

> Is this defined somewhere?

It's recommended in OGC API - Records, which we try to align with. I don't think it's defined anywhere, but it's also not explicitly forbidden anywhere. And I think back in the day, Chris always asked us to have HTML representations alongside JSON to allow crawling by Google etc. Thus, I think this is a very reasonable use case. (I also added a bit more context in radiantearth/stac-spec#1259.)
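The "ignore/pass through" behavior asked for above could look roughly like the following Python sketch. It works on plain link dictionaries, not on pystac objects; the function name `partition_links`, the set of accepted media types, and the sample links are assumptions made for illustration only.

```python
# Assumption: only these JSON media types are treated as loadable STAC.
STAC_MEDIA_TYPES = {"application/json", "application/geo+json"}
# Assumption: only these relation types are resolved into STAC objects.
HIERARCHICAL_RELS = {"child", "item"}


def partition_links(links):
    """Split links into those to resolve as STAC and those to keep untouched."""
    to_resolve, passthrough = [], []
    for link in links:
        media_type = link.get("type")
        is_hierarchical = link.get("rel") in HIERARCHICAL_RELS
        # A missing media type is treated as STAC, a present one must match.
        is_stac_type = media_type is None or media_type in STAC_MEDIA_TYPES
        if is_hierarchical and is_stac_type:
            to_resolve.append(link)
        else:
            passthrough.append(link)
    return to_resolve, passthrough


sample_links = [
    {"rel": "child", "href": "./a/collection.json", "type": "application/json"},
    {"rel": "child", "href": "./a/collection.html", "type": "text/html"},
    {"rel": "license", "href": "./LICENSE.html", "type": "text/html"},
]
to_resolve, passthrough = partition_links(sample_links)
```

The point of keeping a `passthrough` list is that the HTML links are neither loaded nor dropped, so they would survive a read and write round trip unchanged.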

@gadomski gadomski changed the title from "get_child_links/get_item_links: Ensure correct media type?" to "get_child_links/get_item_links: Ensure correct media type" Oct 5, 2023
@m-mohr
Contributor Author

m-mohr commented Oct 5, 2023

Regarding the implementation: What I do in STAC Browser is to check the following:

let stacTypes = ['application/geo+json', 'application/json'];
let stacItems = stac.links.filter(link => link.rel === 'items' && (!link.type || stacTypes.includes(link.type)));

That seems to work with all implementations I've encountered so far. It ensures that the link has a correct media type, but also assumes that a link without a media type in a STAC context points to STAC.
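For reference, a rough Python equivalent of the STAC Browser check might look like this. It is only a sketch on plain link dictionaries and not pystac's API; `filter_stac_links` and the sample links are hypothetical names and data. Like the JavaScript version, it compares the media type as an exact string.

```python
# Same two media types as in the STAC Browser snippet.
STAC_TYPES = ("application/geo+json", "application/json")


def filter_stac_links(links, rel):
    """Keep links with the given rel whose type is missing or a STAC JSON type."""
    return [
        link
        for link in links
        if link.get("rel") == rel
        and (not link.get("type") or link["type"] in STAC_TYPES)
    ]


example_links = [
    {"rel": "items", "href": "./items", "type": "application/geo+json"},
    {"rel": "items", "href": "./items?f=html", "type": "text/html"},
    {"rel": "items", "href": "./items-untyped"},
]
stac_items = filter_stac_links(example_links, "items")
```

Here the HTML link is skipped, while the typed GeoJSON link and the untyped link are both kept.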

@aaime

aaime commented Mar 29, 2024

As for whether it is defined anywhere: it is. The base of all OGC API specifications is called "OGC API - Common", and it has an HTML requirements class.

While optional, its implementation is recommended, quoting:

Therefore, sharing data on the Web should include publication in HTML. To be consistent with the Web, this publication should be done in a way that enables users and search engines to discover and access all of the data.
This is discussed in detail in the W3C/OGC SDW Best Practice. Therefore, the OGC API — Common Standard recommends supporting HTML as an encoding.

@gadomski gadomski added this to the v1.12 milestone Sep 23, 2024