Merge remote-tracking branch 'origin/master'
Eisenbahnplatte committed Jun 27, 2023
2 parents 683f23e + 1ecf365 commit 85cc074
Showing 4 changed files with 163 additions and 29 deletions.
5 changes: 3 additions & 2 deletions SUMMARY.md
@@ -5,7 +5,8 @@
* [Usage](docs/usage/README.md)
* [JAR](docs/usage/jar.md)
* [CLI](docs/usage/cli.md)
* [Docker](docs/usage/docker.md)
* [Docker](usage/docker/README.md)
* [Docker Compose](usage/docker/docker-compose.md)
* [Scala/Java API](docs/usage/api.md)
* [Examples](docs/examples/README.md)
* [Loading geocoordinates from Wikipedia into Virtuoso (Docker)](docs/examples/exampleDocker.md)
* [Loading data into Virtuoso (Docker)](docs/examples/exampleDocker.md)
100 changes: 73 additions & 27 deletions docs/examples/exampleDocker.md
@@ -1,42 +1,88 @@
Deploy a dataset of geocoordinates of Wikipedia into a Docker SPARQL endpoint (Virtuoso).
# Loading data into Virtuoso (Docker)

Deploy a dataset into a Docker SPARQL endpoint (Virtuoso).

### Requirements

* **Docker:** `^3.5`

## Execution

### Get docker-compose.yml

Get the [docker-compose.yml](../../docker/virtuoso-compose/docker-compose.yml) file from the Databus Client repository, or create your own:

```
version: '3.5'
services:
  db:
    image: tenforce/virtuoso
    ports:
      - 8895:8890
    volumes:
      - toLoad:/data/toLoad
    entrypoint: >
      bash -c 'while [ ! -f /data/toLoad/complete ]; do sleep 1; done
      && rm -f /data/toLoad/complete && bash /virtuoso.sh'
  # To change the file query: Mount an external query
  # file under volumes between host and container
  # and apply internal path as environment variable.
  databus_client:
    image: dbpedia/databus-client:latest
    environment:
      - SOURCE=/databus-client/query.sparql
      - ENDPOINT=https://dev.databus.dbpedia.org/sparql
      - COMPRESSION=gz
    volumes:
      - ./myQuery.sparql:/databus-client/query.sparql
      - toLoad:/var/toLoad
    entrypoint: >
      bash -c 'bash /databus-client/entrypoint.sh
      && mv -t /var/toLoad $$(find /var/repo -name "*.gz");
      touch /var/toLoad/complete'
volumes:
  toLoad:
```
git clone https://github.com/dbpedia/databus-client.git
cd databus-client/docker

docker build -t vosdc -f virtuoso-image/Dockerfile virtuoso-image/
### Select your desired data

echo "PREFIX dataid: <http://dataid.dbpedia.org/ns/core#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
Again, specify the desired data in a SPARQL query:

SELECT DISTINCT ?file WHERE {
?dataset dataid:version <https://databus.dbpedia.org/marvin/mappings/geo-coordinates-mappingbased/2019.09.01> .
?dataset dcat:distribution ?distribution .
?distribution dcat:downloadURL ?file .
?distribution dataid:contentVariant ?cv .
FILTER ( str(?cv) = 'de' )
}" > query.sparql
```
echo "PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX databus: <https://dataid.dbpedia.org/databus#>
# delete docker from previous runs
# docker rm vosdc
SELECT ?file WHERE
{
GRAPH ?g
{
?dataset databus:artifact <https://dev.databus.dbpedia.org/tester/testgroup/testartifact> .
{ ?distribution <http://purl.org/dc/terms/hasVersion> '2023-06-23' . }
?dataset dcat:distribution ?distribution .
?distribution databus:file ?file .
}
}" > myQuery.sparql
```
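As an alternative to `echo "..."`, the same query file can be written with a quoted here-document. This is only a sketch of a common shell idiom, not something the repository prescribes: quoting the delimiter (`'EOF'`) keeps the shell from expanding anything inside the query, which helps if a later query contains `$` or backticks.

```shell
# Write the query verbatim; the quoted 'EOF' delimiter disables
# variable and command substitution inside the here-document.
cat > myQuery.sparql <<'EOF'
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX databus: <https://dataid.dbpedia.org/databus#>
SELECT ?file WHERE
{
  GRAPH ?g
  {
    ?dataset databus:artifact <https://dev.databus.dbpedia.org/tester/testgroup/testartifact> .
    { ?distribution <http://purl.org/dc/terms/hasVersion> '2023-06-23' . }
    ?dataset dcat:distribution ?distribution .
    ?distribution databus:file ?file .
  }
}
EOF
```

The resulting `myQuery.sparql` is the file that the compose setup mounts at `/databus-client/query.sparql`.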

# start docker as daemon by adding -d
docker run --name vosdc \
-v $(pwd)/query.sparql:/opt/databus-client/query.sparql \
-v $(pwd)/data:/data \
-e SOURCE="/opt/databus-client/query.sparql" \
-p 8890:8890 \
vosdc
### Start Containers

```
docker compose up
```

The container needs some startup time, so the endpoint is not immediately reachable. Once it is up, you can query it with e.g.
The container needs some startup time, so the endpoint is not immediately reachable. Once it is up, you can query it directly in your browser at [http://localhost:8895/sparql/](http://localhost:8895/sparql/) or from your terminal, e.g.

```
curl --data-urlencode query="SELECT * {<http://de.dbpedia.org/resource/Karlsruhe> ?p ?o }" "http://localhost:8890/sparql"
curl --data-urlencode query="SELECT * {?a <http://xmlns.com/foaf/0.1/account> ?o }" "http://localhost:8895/sparql"
```
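Because the endpoint is not reachable right away, it can help to poll it before sending a query. The following is a minimal sketch: `wait_for_endpoint` is a hypothetical helper name, the default of 60 attempts is arbitrary, and it assumes the endpoint answers a plain GET request with a success status once Virtuoso is ready.

```shell
# Poll a URL once per second until it responds successfully
# or the maximum number of attempts is reached.
# Usage: wait_for_endpoint URL [MAX_TRIES]
wait_for_endpoint() {
  url=$1
  max_tries=${2:-60}
  i=0
  while [ "$i" -lt "$max_tries" ]; do
    # --fail makes curl return non-zero on HTTP error responses
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for_endpoint "http://localhost:8895/sparql" && curl --data-urlencode query="SELECT * {?a <http://xmlns.com/foaf/0.1/account> ?o }" "http://localhost:8895/sparql"` sends the query only after the endpoint has answered.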

### Useful commands
#### Useful commands

Stop and reset the container named `databus-client`, e.g. to change the query:

```
@@ -47,4 +93,4 @@ Delete pulled image

```
docker rmi -f dbpedia/databus-client
```
```
48 changes: 48 additions & 0 deletions usage/docker/README.md
@@ -0,0 +1,48 @@
# Docker

We also provide a dockerized version of the databus client in our [DockerHub repository](https://hub.docker.com/r/dbpedia/databus-client).

You can pass all parameters listed in [#databus-client-parameters](../../docs/usage/#databus-client-parameters "mention") (except `target`) as environment variables (**-e**). The variable names must be written in upper case.

### Requirements

* **Docker:** `^3.5`

### Installation

Pull the Docker image from our [DockerHub repository](https://hub.docker.com/r/dbpedia/databus-client).

```
docker pull dbpedia/databus-client:latest
```

## Example

#### Select Data

```
echo "PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX databus: <https://dataid.dbpedia.org/databus#>
SELECT ?file WHERE
{
GRAPH ?g
{
?dataset databus:artifact <https://dev.databus.dbpedia.org/tester/testgroup/testartifact> .
{ ?distribution <http://purl.org/dc/terms/hasVersion> '2023-06-23' . }
?dataset dcat:distribution ?distribution .
?distribution databus:file ?file .
}
}" > myQuery.sparql
```

```
docker run --name databus-client \
-v $(pwd)/myQuery.sparql:/databus-client/query.sparql \
-v $(pwd)/repo:/var/repo \
-e ENDPOINT="https://dev.databus.dbpedia.org/sparql" \
-e SOURCE="/databus-client/query.sparql" \
-e FORMAT="ttl" \
-e COMPRESSION="bz2" \
dbpedia/databus-client
```
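Once the container has finished, the downloaded files should appear somewhere below the mounted `repo` directory. The directory layout inside it is not described here, so this sketch simply searches recursively. `list_downloads` is a hypothetical helper name, and the default extension `bz2` is an assumption that matches `COMPRESSION="bz2"` above.

```shell
# List downloaded files with a given extension below a directory.
# Usage: list_downloads [DIR] [EXTENSION]
list_downloads() {
  dir=${1:-repo}
  ext=${2:-bz2}
  find "$dir" -type f -name "*.$ext" | sort
}
```

Calling `list_downloads repo bz2` from the directory where `docker run` was started prints one path per matching file.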
39 changes: 39 additions & 0 deletions usage/docker/docker-compose.md
@@ -0,0 +1,39 @@
# Docker Compose

Whether you prefer to use docker or docker compose is a matter of taste. We provide both options.

### Requirements

* **Docker:** `^3.5`

## Example

#### Installation

Get the [docker-compose.yml](../../docker/databus-client-compose/docker-compose.yml) file from the Databus Client repository.

#### Select Data

```
echo "PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX databus: <https://dataid.dbpedia.org/databus#>
SELECT ?file WHERE
{
GRAPH ?g
{
?dataset databus:artifact <https://dev.databus.dbpedia.org/tester/testgroup/testartifact> .
{ ?distribution <http://purl.org/dc/terms/hasVersion> '2023-06-23' . }
?dataset dcat:distribution ?distribution .
?distribution databus:file ?file .
}
}" > myQuery.sparql
```

#### Execute

```
docker compose up
```

The resulting files can be found in the `toLoad` volume.
