Other: Also use persistent RDFlib store for output graphs #61

Open
zmaas opened this issue Oct 9, 2020 · 1 comment
Labels: other, release v3.0.0 (noting work and issues related to release v3.0.0)

Comments

zmaas commented Oct 9, 2020

Once a graph has been built, it may be useful to also import the resulting .owl file into an RDFLib persistent store. A persistent store lets the graph be accessed through RDFLib without loading the entire structure into memory, which can be advantageous when working with large graphs. Below is a sample implementation that uses Berkeley DB as the persistent backend, for which RDFLib has built-in support. Berkeley DB was originally developed by Sleepycat Software, hence the backend name "Sleepycat" when creating the Graph object.

import rdflib

# Placeholder values for illustration only; substitute your own
identifier = "http://example.org/graph"  # name of the graph within the store
uri = "/path/to/store"                   # directory holding the on-disk store
graph_path = "/path/to/graph.owl"        # RDF file to import

# The persistent store requires an identifier
graph_id = rdflib.URIRef(identifier)
# Create the graph with the "Sleepycat" Berkeley DB backend
graph = rdflib.Graph("Sleepycat", identifier=graph_id)
# Open the store and create it if it doesn't exist
graph.open(uri, create=True)
# Parse the graph at 'graph_path', typically XML formatted
# This could take many hours if the graph is large
graph.parse(graph_path)
# Close the graph to free resources. Mostly unnecessary due
# to the small overhead of the on-disk store
graph.close()

Alternatively, the following code wraps the above functionality in a context manager, so the graph can be used inside a with block and is closed automatically on exit:

from contextlib import contextmanager
import rdflib


@contextmanager
def open_persistent_graph(uri, identifier, graph_path=None):
    """Provides a context manager for working with an OWL graph while also
    automatically closing it afterward. URI is the location of the
    graph store directory and IDENTIFIER is the name of the graph
    within that store. Optional argument GRAPH_PATH specifies an
    appropriately formatted RDF file to import when opening the graph.

    """
    # Only force create if a path is provided
    create_graph = bool(graph_path)
    # Open the on-disk store before entering the try block, so the
    # finally clause only ever closes a graph that was actually opened
    # (previously a failure here left `graph` unbound in finally)
    graph_id = rdflib.URIRef(identifier)
    graph = rdflib.Graph("Sleepycat", identifier=graph_id)
    graph.open(uri, create=create_graph)
    try:
        # Parse the file at GRAPH_PATH if set
        if graph_path:
            graph.parse(graph_path)
        yield graph
    finally:
        graph.close()
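
The open/parse/close lifecycle of the context manager above can be checked without rdflib or Berkeley DB installed. The following is only a minimal sketch and not part of the proposed implementation: `StubGraph`, `managed_graph`, and `demo` are hypothetical names, and `StubGraph` merely records the calls a real `rdflib.Graph` would receive. It illustrates that the store is still closed when the body of the with block raises.

```python
from contextlib import contextmanager


class StubGraph:
    """Hypothetical stand-in for rdflib.Graph; it only records calls."""

    def __init__(self):
        self.calls = []

    def open(self, uri, create=False):
        self.calls.append(("open", uri, create))

    def parse(self, path):
        self.calls.append(("parse", path))

    def close(self):
        self.calls.append(("close",))


@contextmanager
def managed_graph(graph, uri, graph_path=None):
    """Open/parse/close lifecycle with the graph object passed in,
    so a stub can be substituted for the Berkeley DB store."""
    # Acquire the resource first, then guard its use with try/finally
    graph.open(uri, create=bool(graph_path))
    try:
        if graph_path:
            graph.parse(graph_path)
        yield graph
    finally:
        graph.close()


def demo():
    """Run the lifecycle once and raise inside the with block."""
    stub = StubGraph()
    try:
        with managed_graph(stub, "store_dir", "graph.owl"):
            raise RuntimeError("simulated failure while using the graph")
    except RuntimeError:
        pass
    return stub.calls


if __name__ == "__main__":
    print(demo())
```

Passing the graph object in, rather than constructing it inside the function, is purely a choice made for this sketch so that the behaviour can be exercised in isolation.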

@callahantiff (Owner)

Thanks so much @zmaas! This is great. I will plan to leave this issue active until we can address it during the rebuild next month. Assuming it's OK with you, I will circle back to you when we are in the re-implementation stage?

@callahantiff added the "release v3.0.0" label on Oct 17, 2020