Should OGC API - Joins include support for Linked Data and Semantic Web approaches? #5

Open
ghobona opened this issue Nov 18, 2022 · 8 comments
Labels
enhancement New feature or request

Comments

@ghobona
Contributor

ghobona commented Nov 18, 2022

Considering that many users of OGC API - Joins are likely to want to associate the feature collections with data from third parties, it might be worth looking into supporting some Linked Data approaches. For example:

  • Export in JSON-LD
  • Import in Turtle
  • Querying in SPARQL

Any thoughts are welcome. Let's aim to close the GitHub Issue at the February 2023 OGC Member Meeting.

@rob-metalinkage

rob-metalinkage commented Nov 21, 2022

Irrespective of RDF/SPARQL, the presence of unambiguous identifiers for data elements using basic JSON-LD allows clients to actually join on common identifiers in two different datasets. I think this really minimal enabler is more important than supporting RDF as a logical model - JSON-LD allows JSON to be joined automatically - without it, it's guesswork, or out-of-band magic configuration is required to specify joins.
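
To illustrate the point about joining on common identifiers, here is a minimal sketch in Python. It is not a JSON-LD processor and not part of OGC API - Joins: it only mimics the term-to-URI mapping a simple JSON-LD context provides. The field names, context URIs and values are all made up for illustration.

```python
# Hypothetical contexts: each maps a dataset's local field names to URIs.
# The URIs under example.org and all values below are illustrative assumptions.
CONTEXT_A = {
    "muni_code": "https://example.org/def/municipalityCode",
    "name": "https://schema.org/name",
}
CONTEXT_B = {
    "gemeentecode": "https://example.org/def/municipalityCode",
    "population": "https://example.org/def/population",
}


def expand(record, context):
    """Rename mapped keys to their URIs; keys without a mapping are dropped."""
    return {context[key]: value for key, value in record.items() if key in context}


def join_on_shared_term(rows_a, ctx_a, rows_b, ctx_b, key_uri):
    """Inner-join two lists of records on the field that both contexts map to key_uri."""
    index = {}
    for row in rows_b:
        expanded = expand(row, ctx_b)
        if expanded.get(key_uri) is not None:
            index[expanded[key_uri]] = expanded

    joined = []
    for row in rows_a:
        expanded = expand(row, ctx_a)
        key = expanded.get(key_uri)
        if key is not None and key in index:
            # Merge the two expanded records; the shared key appears once.
            joined.append({**expanded, **index[key]})
    return joined


rows_a = [{"muni_code": "A1", "name": "Town One"}, {"muni_code": "Z9", "name": "Town Nine"}]
rows_b = [{"gemeentecode": "A1", "population": 100}]
result = join_on_shared_term(
    rows_a, CONTEXT_A, rows_b, CONTEXT_B,
    "https://example.org/def/municipalityCode",
)
```

The client never needs to be told that `muni_code` and `gemeentecode` are the same thing: both contexts point them at the same URI, which is what makes the join automatic.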

@lvdbrink

I hadn't heard about OGC API Joins before and don't know how it is envisioned to work. It's hard to speculate about Linked Data/Semantic web support without this...

I can say that from Dutch experiments we concluded that it's hard to combine data from different APIs when it's not clear what the common identifiers are. At the least a common way to express identifiers is needed (some standardized URI pattern). But probably more is needed to actually get to the data (especially if it's in another API).

@ldesousa

This API seems to be meant for the ad hoc creation of flat tables from multiple sources, potentially heavy on redundancy. This paradigm is at odds with the semantic web, which instead strives for a linked and federated data paradigm.

However, there are some benefits to adopting semantic web best practices in this API, or in any other API for data provision: URIs as unique feature and attribute identifiers, and the application of well-established ontologies such as GeoSPARQL, SOSA, SKOS or QUDT. I wonder if it wouldn't be more efficient to simply lay out a standard for spatial data provision with OGC services instead of figuring it out specifically for each API. GeoSPARQL and the DCAT specialisation for spatial datasets would already go a long way.

Also keep in mind the work currently being developed around the Features API with Prez and the ogcldapi profile.

@rob-metalinkage

@lvdbrink note that by adding a JSON-LD context via a Link header, it is possible to map the fields of JSON payloads to URIs - to unambiguously specify whether two data values are in fact the same identifier in different contexts.

The same mapping of a namespace onto identifier tokens might be possible - yet to bottom this out.

The ability to use a URI as an identifier in the actual data is easier to interpret but harder to achieve. In theory, though, many more users need to interpret the data than providers need to deliver it, so it would appear to be a reasonable investment - e.g. from a FAIR perspective, URI identifiers are best.
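
As a rough sketch of the Link header idea above: JSON-LD defines the link relation `http://www.w3.org/ns/json-ld#context` for attaching a context to a plain JSON response. The Python below shows how a client might pick that context URL out of a `Link` header value. The parsing is deliberately simplified (it assumes no commas inside URLs or quoted parameters), and the example.org URLs are made up.

```python
import re

# Link relation defined by JSON-LD for interpreting plain JSON as JSON-LD.
JSONLD_CONTEXT_REL = "http://www.w3.org/ns/json-ld#context"


def find_jsonld_context(link_header):
    """Return the first context URL advertised in a Link header value, or None."""
    # Simplified split: assumes commas only separate individual links.
    for part in link_header.split(","):
        match = re.match(r"\s*<([^>]*)>(.*)", part, re.S)
        if not match:
            continue
        url, params = match.groups()
        rel = re.search(r';\s*rel\s*=\s*"?([^";]*)"?', params)
        # A rel parameter may hold several space-separated relation types.
        if rel and JSONLD_CONTEXT_REL in rel.group(1).split():
            return url
    return None


# Hypothetical header value, as a server might send alongside application/json.
header = (
    '<https://example.org/items?page=2>; rel="next", '
    '<https://example.org/contexts/table.jsonld>; '
    'rel="http://www.w3.org/ns/json-ld#context"; type="application/ld+json"'
)
context_url = find_jsonld_context(header)
```

With the context URL in hand, a client can fetch the context and resolve each field name in the JSON payload to a URI without the payload itself having to change.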

@lvdbrink

"At the least a common way to express identifiers is needed (some standardized URI pattern)."

Quoting myself in order to correct: it's not so much a standardized URI pattern that's needed, but some way to recognize that something is an identifier and where/how to get the information resource that provides information on the thing being identified. Indeed, URI identifiers are the best way to do that on the web.

It's important that the identifiers are unique and persistent.

But that's not a concern for the API, rather something the data source must accommodate.

@joanma747

At a minimum, it would be nice to have a mechanism to define the meaning of each field in the table by pointing to a URI that defines the content of each column.
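
A minimal sketch of what such a mechanism could look like on the client side, in Python. The `Column` structure and its `definition` attribute are hypothetical and not taken from any OGC API - Joins draft; the example.org URIs are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical column description: a local name plus a URI defining its meaning."""
    name: str
    definition: str


def joinable_columns(table_a, table_b):
    """Pair up columns from two tables that point to the same definition URI."""
    by_definition = {column.definition: column.name for column in table_b}
    return [
        (column.name, by_definition[column.definition])
        for column in table_a
        if column.definition in by_definition
    ]


# Illustrative column descriptions; all URIs are assumptions.
features = [
    Column("region_id", "https://example.org/def/regionCode"),
    Column("label", "https://example.org/def/regionName"),
]
statistics = [
    Column("code", "https://example.org/def/regionCode"),
    Column("unemployment", "https://example.org/def/unemploymentRate"),
]
pairs = joinable_columns(features, statistics)
```

Here `pairs` identifies `region_id` and `code` as candidate join keys purely from the column definitions, without relying on matching column names.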

@rob-metalinkage

Having now had considerable experience mapping schemas and semantics using JSON-LD, I can say it's certainly a viable option that uses available standards. However, two things make this complicated enough to require standardisation of mechanisms.

Firstly, the JSON-LD context needs to reflect the schema structure. In practice this means tools are needed to bundle contexts using schema fragment mappings. It's too hard to do this manually, and tools aren't good enough to help debug.

Secondly, many structural elements have their own semantics, and if these don't exist in the target ontology then an additional ontology needs to be defined. We can define transformations once we get RDF from the JSON-LD, but can't always map directly to the intended semantics. E.g. there is no mapping from GeoJSON geometry to GeoSPARQL equivalents.
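
To make the GeoJSON example concrete, one kind of transformation that could sit after the JSON-LD step is converting a GeoJSON geometry object into a WKT string typed as a GeoSPARQL `geo:wktLiteral`. The Python below is a sketch of that idea only, covering three geometry types; it is not how the OGC building blocks implement it. It assumes the GeoJSON coordinates are longitude/latitude, which matches the default CRS84 axis order of a `wktLiteral` without an explicit CRS URI.

```python
# Datatype URI of GeoSPARQL's WKT literal.
WKT_LITERAL = "http://www.opengis.net/ont/geosparql#wktLiteral"


def _positions(coords):
    """Format a list of GeoJSON positions as a parenthesised WKT coordinate list."""
    # Only the first two ordinates are used; any elevation value is ignored.
    return "(" + ", ".join(f"{pos[0]} {pos[1]}" for pos in coords) + ")"


def geojson_to_wkt(geometry):
    """Convert a GeoJSON Point, LineString or Polygon object to a WKT string."""
    kind = geometry["type"]
    coords = geometry["coordinates"]
    if kind == "Point":
        return f"POINT ({coords[0]} {coords[1]})"
    if kind == "LineString":
        return "LINESTRING " + _positions(coords)
    if kind == "Polygon":
        # A polygon is a list of linear rings: exterior first, then any holes.
        return "POLYGON (" + ", ".join(_positions(ring) for ring in coords) + ")"
    raise ValueError(f"unsupported geometry type: {kind}")


def to_wkt_literal(geometry):
    """Return (lexical form, datatype URI) for use as a typed RDF literal."""
    return geojson_to_wkt(geometry), WKT_LITERAL


point = {"type": "Point", "coordinates": [4.5, 52.0]}
lexical, datatype = to_wkt_literal(point)
# lexical is "POINT (4.5 52.0)"
```

The pair returned by `to_wkt_literal` could then be attached to a geometry node with `geo:asWKT` by whatever RDF library is in use.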

The OGC building blocks handle all these concerns in a flexible way, and a library of mappings for OGC API components is being developed.

@ghobona ghobona added the enhancement New feature or request label Sep 24, 2024
@ghobona
Contributor Author

ghobona commented Sep 24, 2024

2024-09-24: The SWG discussed this and agreed that this could be an enhancement in a future version of the standard.

5 participants