Update README.md #339

Merged
merged 2 commits into from
Jul 31, 2024
158 changes: 91 additions & 67 deletions README.md
@@ -8,21 +8,28 @@

# Archivista

Archivista is a graph and storage service for [in-toto](https://in-toto.io)
attestations. Archivista enables the discovery and retrieval of attestations for
software artifacts.

## Archivista enables you to

- Store and retrieve in-toto attestations
- Query for relationships between attestations via a GraphQL API
- Validate Witness policy without the need to manually list expected
attestations

## Archivista is a trusted store for supply chain metadata

- It creates a graph of supply chain metadata while storing attestations that
can later be used for policy validation and flexible querying.
- It is designed to be horizontally scalable and to store a large number of
attestations.
- It supports deployment on major cloud service and infrastructure providers,
making it a versatile and flexible solution for securing software supply
chains.
- It stores only signed attestations to further enhance security and increase
trust.

## Key Features

@@ -36,88 +43,104 @@ and retrieval of attestations for software artifacts.

## How Archivista Works

When an attestation is uploaded, Archivista stores the entire attestation in a
configured object store, extracts some data from the attestation, and stores
that data in a queryable metadata store. This metadata is exposed through a
GraphQL API, which enables queries such as finding all attestations related to
an artifact with a specified hash or finding all attestations that recorded the
use of a specific dependency.
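
The split between a full-object store and a queryable metadata index can be illustrated with a small in-memory sketch. Everything here is hypothetical: the `TinyArchive` class, its attribute names, the choice to upload a bare in-toto Statement as JSON, and the use of a plain SHA-256 of the uploaded bytes as the identifier are assumptions for illustration, not Archivista's implementation.

```python
import hashlib
import json
from collections import defaultdict


class TinyArchive:
    """Hypothetical in-memory stand-in for an object store plus a metadata index."""

    def __init__(self):
        self.objects = {}                          # identifier -> raw attestation bytes
        self.by_subject_digest = defaultdict(set)  # subject digest -> identifiers

    def upload(self, raw: bytes) -> str:
        # Keep the full attestation, keyed by a content hash.
        # Assumption: plain SHA-256 stands in for the gitoid Archivista uses.
        ident = hashlib.sha256(raw).hexdigest()
        self.objects[ident] = raw
        # Extract a little metadata: every digest of every subject.
        statement = json.loads(raw)
        for subject in statement.get("subject", []):
            for digest in subject.get("digest", {}).values():
                self.by_subject_digest[digest].add(ident)
        return ident

    def find_by_digest(self, digest: str) -> set:
        # Look up attestation identifiers without touching the stored objects.
        return set(self.by_subject_digest.get(digest, set()))


statement = json.dumps({
    "_type": "https://in-toto.io/Statement/v0.1",
    "subject": [{"name": "app.bin", "digest": {"sha256": "abc123"}}],
    "predicateType": "https://example.com/build/v1",  # placeholder predicate type
    "predicate": {},
}).encode()

archive = TinyArchive()
ident = archive.upload(statement)
```

Here `archive.find_by_digest("abc123")` returns the identifier of the uploaded statement, and the original bytes remain retrievable from `archive.objects`.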

Archivista uses Subjects on the
[in-toto Statement](https://github.com/in-toto/attestation/blob/main/spec/README.md#statement)
as edges on this graph. Producers of attestations (such as
[Witness](https://github.com/in-toto/witness)) can use these subjects as a way
to expose relationships between attestations.

For example, when attesting that an artifact was compiled, the compiled artifact
may be a subject, as well as the git commit hash the artifact was built from.
This allows traversing the graph by the commit hash to find other relevant
attestations, such as those describing code reviews, testing, and scanning that
happened on that git commit.
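
The traversal described above can be sketched with plain dictionaries. The attestation names and digest strings below are made up for illustration, and `related_by_subject` is a hypothetical helper, not part of Archivista.

```python
from collections import defaultdict

# Hypothetical attestations: name -> set of subject digests.
ATTESTATIONS = {
    "build": {"sha256:artifact-1", "sha1:commit-abc"},
    "code-review": {"sha1:commit-abc"},
    "test": {"sha1:commit-abc"},
    "scan-other": {"sha1:commit-def"},
}


def related_by_subject(attestations, start):
    """Return the attestations that share at least one subject with `start`."""
    # Invert the mapping so each subject points at the attestations naming it.
    index = defaultdict(set)
    for name, subjects in attestations.items():
        for subject in subjects:
            index[subject].add(name)
    # Follow every subject of the starting attestation (one hop on the graph).
    related = set()
    for subject in attestations[start]:
        related |= index[subject]
    related.discard(start)
    return related


print(sorted(related_by_subject(ATTESTATIONS, "build")))  # ['code-review', 'test']
```

Because the build attestation lists the commit hash as a subject, the review and test attestations for that commit become reachable from it, while the scan of a different commit does not.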

## Running Archivista

A public instance of Archivista is running
[here](https://archivista.testifysec.io) for testing purposes. The data in this
instance is open to the world and there are currently no SLAs defined for this
instance.

Archivista requires a MySQL database as well as a compatible file store.
Compatible file stores include a local directory or any S3-compatible store.

A Docker Compose file is included in the repository that will run a local
instance of Archivista along with the services it needs to operate, namely
MinIO and MySQL. Simply cloning the repo and running

```bash
docker compose up --build -d
```

is enough to get a local instance of Archivista up and running. Archivista will
be listening at `http://localhost:8082` by default with this docker compose
file.

### Configuration

Archivista is currently configured through environment variables.

| Variable | Default Value | Description |
| ------------------------------------------ | ----------------------------------------- | --------------------------------------------------------------------------------------------- |
| ARCHIVISTA_LISTEN_ON | tcp://127.0.0.1:8082 | URL endpoint for Archivista to listen on |
| ARCHIVISTA_LOG_LEVEL | INFO | Log level. Options are DEBUG, INFO, WARN, ERROR |
| ARCHIVISTA_CORS_ALLOW_ORIGINS | | Comma separated list of origins to allow CORS requests from |
| ARCHIVISTA_SQL_STORE_BACKEND | | Backend to use for SQL. Options are MYSQL or PSQL |
| ARCHIVISTA_SQL_STORE_CONNECTION_STRING | postgresql://root:example@tcp(db)/testify | SQL store connection string |
| ARCHIVISTA_STORAGE_BACKEND | | Backend to use for attestation storage. Options are FILE, BLOB, or empty string for disabled. |
| ARCHIVISTA_FILE_SERVE_ON | | What address to serve files on. Only valid when using FILE storage backend (e.g. `:8081`). |
| ARCHIVISTA_FILE_DIR | /tmp/archivista/ | Directory to store and serve files. Only valid when using FILE storage backend. |
| ARCHIVISTA_BLOB_STORE_ENDPOINT | 127.0.0.1:9000 | URL endpoint for blob storage. Only valid when using BLOB storage backend. |
| ARCHIVISTA_BLOB_STORE_CREDENTIAL_TYPE | | Blob store credential type. Options are IAM or ACCESS_KEY. |
| ARCHIVISTA_BLOB_STORE_ACCESS_KEY_ID | | Blob store access key id. Only valid when using BLOB storage backend. |
| ARCHIVISTA_BLOB_STORE_SECRET_ACCESS_KEY_ID | | Blob store secret access key id. Only valid when using BLOB storage backend. |
| ARCHIVISTA_BLOB_STORE_USE_TLS | TRUE | Use TLS for BLOB storage backend. Only valid when using BLOB storage backend. |
| ARCHIVISTA_BLOB_STORE_BUCKET_NAME | | Bucket to use for storage. Only valid when using BLOB storage backend. |
| ARCHIVISTA_ENABLE_GRAPHQL | TRUE | Enable GraphQL Endpoint |
| ARCHIVISTA_GRAPHQL_WEB_CLIENT_ENABLE | TRUE | Enable GraphiQL, the GraphQL web client |
| ARCHIVISTA_ENABLE_ARTIFACT_STORE | FALSE | Enable Artifact Store Endpoints |
| ARCHIVISTA_ARTIFACT_STORE_CONFIG | /tmp/artifacts/config.yaml | Location of the config describing available artifacts |
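
Several variables in the table are only valid with one storage backend. As a minimal sketch, the check below flags settings that do not match the chosen `ARCHIVISTA_STORAGE_BACKEND`. The function `check_storage_settings` and the example bucket name are hypothetical; the variable names and backend options come from the table, and Archivista itself may handle mismatches differently.

```python
# Variables the table marks as valid only for one storage backend.
FILE_ONLY = {"ARCHIVISTA_FILE_SERVE_ON", "ARCHIVISTA_FILE_DIR"}
BLOB_ONLY = {
    "ARCHIVISTA_BLOB_STORE_ENDPOINT",
    "ARCHIVISTA_BLOB_STORE_ACCESS_KEY_ID",
    "ARCHIVISTA_BLOB_STORE_SECRET_ACCESS_KEY_ID",
    "ARCHIVISTA_BLOB_STORE_USE_TLS",
    "ARCHIVISTA_BLOB_STORE_BUCKET_NAME",
}


def check_storage_settings(env):
    """Return warnings for settings that do not match the storage backend."""
    backend = env.get("ARCHIVISTA_STORAGE_BACKEND", "")
    warnings = []
    if backend not in ("", "FILE", "BLOB"):
        warnings.append(f"unknown storage backend: {backend!r}")
    if backend != "FILE":
        for name in sorted(FILE_ONLY):
            if name in env:
                warnings.append(f"{name} is only valid with the FILE backend")
    if backend != "BLOB":
        for name in sorted(BLOB_ONLY):
            if name in env:
                warnings.append(f"{name} is only valid with the BLOB backend")
    return warnings


example_env = {
    "ARCHIVISTA_STORAGE_BACKEND": "FILE",
    "ARCHIVISTA_FILE_DIR": "/tmp/archivista/",
    "ARCHIVISTA_BLOB_STORE_BUCKET_NAME": "attestations",  # made-up bucket name
}
for warning in check_storage_settings(example_env):
    print(warning)
```

In practice the same function could be called with `dict(os.environ)` before starting the service.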

## Using Archivista

Archivista exposes two HTTP endpoints to upload or download attestations:

```http
POST /upload - Uploads an attestation to Archivista. The attestation is to be in the request's body
```

```http
GET /download/:gitoid: - Downloads an attestation with provided gitoid from Archivista
```
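
A minimal client sketch for these two endpoints, using only the Python standard library. The base URL is the local docker compose default mentioned earlier; the helper names and the `Content-Type` header are assumptions, and the response body is returned as raw bytes because its format is not described here.

```python
import urllib.request

BASE_URL = "http://localhost:8082"  # default address of the local docker compose setup


def build_upload_request(base_url, attestation: bytes):
    """Build a POST /upload request carrying the attestation as the body."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/upload",
        data=attestation,
        headers={"Content-Type": "application/json"},  # assumption, not confirmed
        method="POST",
    )


def build_download_request(base_url, gitoid: str):
    """Build a GET /download/<gitoid> request."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/download/{gitoid}",
        method="GET",
    )


def send(request, timeout=10.0) -> bytes:
    """Send a prepared request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With a running instance, `send(build_upload_request(BASE_URL, raw_bytes))` uploads an attestation and `send(build_download_request(BASE_URL, gitoid))` fetches one back. Building the request separately from sending it keeps the URL handling easy to check without a server.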

Additionally, Archivista exposes a GraphQL API. By default, the GraphQL
playground is enabled and available at the root path.

`archivistactl` is a CLI tool in this repository that is available to interact
with an Archivista instance. `archivistactl` is capable of uploading and
downloading attestations as well as doing some basic queries such as finding all
attestations with a specified subject and retrieving all subjects for a
specified attestation.

## Navigating the Graph

As previously mentioned, Archivista offers a GraphQL API that enables users to
discover attestations. When Archivista ingests an attestation, some metadata is
stored in the SQL metadata store. This metadata is exposed through the GraphQL
API. Archivista uses
[Relay connections](https://relay.dev/graphql/connections.htm) for querying and
pagination.
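
Relay connections are paged by passing `first` and `after` and reading `pageInfo { hasNextPage endCursor }` from each response. The sketch below walks such a connection page by page. The connection name `dsses`, the field `gitoidSha256`, and the `Cursor` scalar are assumptions about the schema that should be checked in the GraphQL playground; `fetch_page` is a hypothetical callable that sends the request body to the GraphQL endpoint and returns the decoded JSON response.

```python
def page_query(after=None, first=50):
    """Build the GraphQL request body for one page of a Relay connection."""
    query = """
    query ($first: Int!, $after: Cursor) {
      dsses(first: $first, after: $after) {
        edges { node { gitoidSha256 } }
        pageInfo { hasNextPage endCursor }
      }
    }
    """
    return {"query": query, "variables": {"first": first, "after": after}}


def iter_nodes(fetch_page, connection="dsses", first=50):
    """Yield every node of a connection, following endCursor until exhausted."""
    after = None
    while True:
        response = fetch_page(page_query(after=after, first=first))
        conn = response["data"][connection]
        for edge in conn["edges"]:
            yield edge["node"]
        info = conn["pageInfo"]
        if not info["hasNextPage"]:
            return
        after = info["endCursor"]
```

Injecting `fetch_page` keeps the paging logic independent of the HTTP client, so it can be exercised against canned responses before pointing it at a real instance.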

Here is an entity relationship diagram of the metadata that is currently
available.

```mermaid
erDiagram
```

@@ -174,16 +197,17 @@ timestamp {

## Deployment

Archivista can be easily deployed into your Kubernetes cluster through the
provided Helm chart. See the [README](chart/README.md) for more details.

## What's Next

We would like to expand the types of data Archivista can ingest as well as
expand the metadata Archivista collects about ingested data. If you have ideas
or use cases for Archivista, feel free to
[contact us](mailto:[email protected]) or create an issue!

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for information on how to contribute to
Archivista.