Operations: Add tooling/docs/monitoring for thanos bucket inspect/replicate (#572)

* operations: Add template to configure Thanos S3 secret from params

* operations: Add template to run bucket inspect tool as a Job

* operations: Add template to run bucket inspect tool as a CronJob

* operations: Add docs for bucket inspect tool

* operations: Add template to run bucket replicate tool as a Job

* operations: Add PodMonitor template for replicate Job

* operations: Add docs for bucket replicate tool
philipgough authored Aug 2, 2023
1 parent 469a134 commit 0caac3b
Showing 8 changed files with 417 additions and 0 deletions.
31 changes: 31 additions & 0 deletions resources/operations/bucket-inspect/README.md
@@ -0,0 +1,31 @@
# What

This template deploys [Thanos Bucket Inspect](https://thanos.io/tip/components/tools.md/#bucket-inspect)
as a Kubernetes Job or CronJob.

# SOP

Create a Kubernetes Secret that contains the credentials for the target object storage provider, or use the
template provided in this directory for S3 compatible object storage providers.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: thanos-bucket-inspect-config
type: Opaque
stringData:
  config.yaml: |
    # see https://thanos.io/tip/thanos/storage.md/
```
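
For S3 compatible storage, the Secret can instead be generated from `s3-secret-template.yaml`. The following is a minimal sketch, not a prescribed procedure: the function name `create_inspect_s3_secret` is hypothetical, and it assumes the credentials are already exported as `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` (illustrative variable names). The region and endpoint keep the template defaults. Values passed with `-p` are visible in the process list, so `oc process --param-file` may be preferable for real credentials.

```bash
# create_inspect_s3_secret renders s3-secret-template.yaml and applies the result.
# Usage: create_inspect_s3_secret <bucket-name>
# Assumes AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are exported (illustrative names).
create_inspect_s3_secret() {
  local bucket="$1"

  # Only the required parameters are set; S3_BUCKET_ENDPOINT and
  # S3_BUCKET_REGION fall back to the defaults defined in the template.
  oc process -f s3-secret-template.yaml \
    -p S3_BUCKET_NAME="${bucket}" \
    -p ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID}" \
    -p SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY}" \
    | oc apply -f -
}
```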
Process the template and run the Job:
```bash
oc process -f job-template.yaml | oc apply -f -
```

Alternatively, run it on a schedule as a CronJob:
```bash
oc process -f cron-job-template.yaml | oc apply -f -
```
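
Template defaults such as `TENANT_ID` and `NAMESPACE` can be overridden with `-p`, and the inspect output should then be readable from the Job logs. The sketch below wraps both steps in a hypothetical helper named `inspect_tenant`; the 10 minute timeout is an arbitrary assumption, and the Job name is the default `NAME` from `job-template.yaml`.

```bash
# inspect_tenant runs the inspect Job for one tenant and prints its logs.
# Usage: inspect_tenant <tenant-id> [namespace]
inspect_tenant() {
  local tenant="$1"
  local namespace="${2:-observatorium-operations}"
  local name="thanos-bucket-inspect"  # default NAME in job-template.yaml

  # Override only TENANT_ID and NAMESPACE; other parameters keep their defaults.
  oc process -f job-template.yaml \
    -p NAMESPACE="${namespace}" \
    -p TENANT_ID="${tenant}" \
    | oc apply -f -

  # Wait for the Job to complete, then print what the tool wrote to the logs.
  oc wait --for=condition=complete "job/${name}" -n "${namespace}" --timeout=10m \
    && oc logs "job/${name}" -n "${namespace}"
}
```

For example, `inspect_tenant rhobs` would inspect the default tenant. Because the Pod template of an existing Job cannot be changed, a previous Job with the same name would need to be deleted before re-running with different parameters.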
62 changes: 62 additions & 0 deletions resources/operations/bucket-inspect/cron-job-template.yaml
@@ -0,0 +1,62 @@
apiVersion: template.openshift.io/v1
kind: Template
metadata:
  name: rhobs-thanos-bucket-inspect-cron
labels:
  app.kubernetes.io/name: thanos-bucket-inspect
  app.kubernetes.io/part-of: observatorium
description: |
  Inspect data in an object storage provider bucket on a schedule
parameters:
  - name: NAME
    description: The name of the CronJob.
    value: 'thanos-bucket-inspect'
  - name: NAMESPACE
    description: The namespace where the Job should run.
    value: 'observatorium-operations'
  - name: OBJ_STORE_CONFIG_SECRET_NAME
    value: 'thanos-bucket-inspect-config'
  - name: SCHEDULE
    description: The schedule for the Job to run. Defaults to every 12 hours.
    value: '0 */12 * * *'
  - name: TENANT_ID
    value: 'rhobs'
  - name: IMAGE_TAG
    value: 'v0.31.0'
  - name: LOG_LEVEL
    value: 'info'
objects:
  - apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: ${NAME}
      namespace: ${NAMESPACE}
      labels:
        app.kubernetes.io/name: thanos-bucket-inspect
        app.kubernetes.io/part-of: observatorium
    spec:
      schedule: ${SCHEDULE}
      jobTemplate:
        spec:
          template:
            spec:
              containers:
                - name: thanos-bucket-inspect
                  image: quay.io/thanos/thanos:${IMAGE_TAG}
                  volumeMounts:
                    - name: obj-store-config
                      readOnly: true
                      mountPath: "/var/lib/thanos/bucket-inspect-config"
                  args:
                    - 'tools'
                    - 'bucket'
                    - 'inspect'
                    - '--log.level=${LOG_LEVEL}'
                    - '--objstore.config-file=/var/lib/thanos/bucket-inspect-config/config.yaml'
                    - '--selector=tenant_id="${TENANT_ID}"'
              restartPolicy: Never
              volumes:
                - name: obj-store-config
                  secret:
                    secretName: ${OBJ_STORE_CONFIG_SECRET_NAME}

57 changes: 57 additions & 0 deletions resources/operations/bucket-inspect/job-template.yaml
@@ -0,0 +1,57 @@
apiVersion: template.openshift.io/v1
kind: Template
metadata:
  name: rhobs-thanos-bucket-inspect
labels:
  app.kubernetes.io/name: thanos-bucket-inspect
  app.kubernetes.io/part-of: observatorium
description: |
  Inspect data in an object storage provider bucket
parameters:
  - name: NAME
    description: The name of the Job.
    value: 'thanos-bucket-inspect'
  - name: NAMESPACE
    description: The namespace where the Job should run.
    value: 'observatorium-operations'
  - name: OBJ_STORE_CONFIG_SECRET_NAME
    value: 'thanos-bucket-inspect-config'
  - name: TENANT_ID
    value: 'rhobs'
  - name: IMAGE_TAG
    value: 'v0.31.0'
  - name: LOG_LEVEL
    value: 'info'
objects:
  - apiVersion: batch/v1
    kind: Job
    metadata:
      name: ${NAME}
      namespace: ${NAMESPACE}
      labels:
        app.kubernetes.io/name: thanos-bucket-inspect
        app.kubernetes.io/part-of: observatorium
    spec:
      backoffLimit: 4
      template:
        spec:
          containers:
            - name: thanos-bucket-inspect
              image: quay.io/thanos/thanos:${IMAGE_TAG}
              volumeMounts:
                - name: obj-store-config
                  readOnly: true
                  mountPath: "/var/lib/thanos/bucket-inspect-config"
              args:
                - 'tools'
                - 'bucket'
                - 'inspect'
                - '--log.level=${LOG_LEVEL}'
                - '--objstore.config-file=/var/lib/thanos/bucket-inspect-config/config.yaml'
                - '--selector=tenant_id="${TENANT_ID}"'
          restartPolicy: Never
          volumes:
            - name: obj-store-config
              secret:
                secretName: ${OBJ_STORE_CONFIG_SECRET_NAME}

43 changes: 43 additions & 0 deletions resources/operations/bucket-inspect/s3-secret-template.yaml
@@ -0,0 +1,43 @@
apiVersion: template.openshift.io/v1
kind: Template
metadata:
  name: Thanos Bucket Inspect
labels:
  app.kubernetes.io/name: thanos-bucket-inspect
  app.kubernetes.io/part-of: observatorium
description: |
  This template creates a Secret that supports Thanos Object Storage inspection for S3.
parameters:
  - name: NAMESPACE
    description: The namespace where the Secret will be created.
    value: 'observatorium-operations'
  - name: OBJ_STORE_CONFIG_SECRET_NAME
    value: 'thanos-bucket-inspect-config'
  - name: ACCESS_KEY_ID
  - name: SECRET_ACCESS_KEY
  - name: S3_BUCKET_NAME
  - name: S3_BUCKET_ENDPOINT
    value: s3.us-east-1.amazonaws.com
  - name: S3_BUCKET_REGION
    value: us-east-1
objects:
  - apiVersion: v1
    kind: Secret
    metadata:
      name: ${OBJ_STORE_CONFIG_SECRET_NAME}
      namespace: ${NAMESPACE}
      labels:
        app.kubernetes.io/name: thanos-bucket-inspect
        app.kubernetes.io/part-of: observatorium
    type: Opaque
    stringData:
      config.yaml: |
        type: S3
        config:
          bucket: ${S3_BUCKET_NAME}
          region: ${S3_BUCKET_REGION}
          access_key: ${ACCESS_KEY_ID}
          secret_key: ${SECRET_ACCESS_KEY}
          endpoint: ${S3_BUCKET_ENDPOINT}
51 changes: 51 additions & 0 deletions resources/operations/bucket-replicate/README.md
@@ -0,0 +1,51 @@
# What

This template deploys [Thanos Bucket Replicate](https://thanos.io/tip/components/tools.md/#bucket-replicate)
as a Kubernetes Job.

# SOP

> **_NOTE:_** To track replication progress via logs, run the
> [Thanos Bucket Inspect](../bucket-inspect/README.md#sop) Job against the source bucket
> and the CronJob against the destination bucket before starting this Job, then compare their
> output to confirm that the source and destination are in sync.
> Logs are especially useful if you do not have access to the Prometheus metrics, or if the Job
> completes before Prometheus scrapes it.

Create one Kubernetes Secret for the source and one for the destination object storage provider, each
containing the credentials for that provider, or use the template provided in this directory for S3
compatible object storage providers.


```yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: thanos-bucket-replicate-config-source
type: Opaque
stringData:
  config.yaml: |
    # see https://thanos.io/tip/thanos/storage.md/
---
apiVersion: v1
kind: Secret
metadata:
  name: thanos-bucket-replicate-config-destination
type: Opaque
stringData:
  config.yaml: |
    # see https://thanos.io/tip/thanos/storage.md/
```
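
Once both Secrets exist, the inspect templates from `../bucket-inspect` can be pointed at them to compare the two buckets, as the note at the top of this SOP suggests. This is a sketch under assumptions: the helper name `inspect_replication_buckets` and the two `NAME` values are hypothetical, and the Secret names are the defaults declared in this directory's `job-template.yaml`.

```bash
# inspect_replication_buckets applies the inspect Job for the source bucket
# and the inspect CronJob for the destination bucket.
inspect_replication_buckets() {
  local inspect_dir="../bucket-inspect"

  # One-off inspection of the source bucket.
  oc process -f "${inspect_dir}/job-template.yaml" \
    -p NAME=thanos-bucket-inspect-source \
    -p OBJ_STORE_CONFIG_SECRET_NAME=thanos-bucket-replicate-config-source \
    | oc apply -f -

  # Scheduled inspection of the destination bucket while replication runs.
  oc process -f "${inspect_dir}/cron-job-template.yaml" \
    -p NAME=thanos-bucket-inspect-destination \
    -p OBJ_STORE_CONFIG_SECRET_NAME=thanos-bucket-replicate-config-destination \
    | oc apply -f -
}
```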
Optionally, create the PodMonitor so that Prometheus scrapes metrics from the Job:
```bash
oc process -f monitoring-template.yaml | oc apply -f -
```

Process the template and run the Job:

```bash
oc process -f job-template.yaml | oc apply -f -
```
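
By default the template replicates the full time range. To restrict a run to recent data, the `MIN_TIME` parameter can be overridden with an RFC 3339 timestamp. The following sketch uses a hypothetical helper named `replicate_recent` and assumes GNU `date` is available for the relative date arithmetic.

```bash
# replicate_recent runs the replicate Job with MIN_TIME set to N days ago.
# Usage: replicate_recent <days>
replicate_recent() {
  local days="$1"
  local min_time

  # GNU date syntax; produces a UTC timestamp such as 2023-07-03T00:00:00Z.
  min_time="$(date -u -d "${days} days ago" +%Y-%m-%dT%H:%M:%SZ)"

  # Only MIN_TIME is overridden; MAX_TIME and the other parameters keep
  # the defaults declared in job-template.yaml.
  oc process -f job-template.yaml \
    -p MIN_TIME="${min_time}" \
    | oc apply -f -
}
```

For example, `replicate_recent 30` would pass a `--min-time` of roughly thirty days before the current time to the replicate tool.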


97 changes: 97 additions & 0 deletions resources/operations/bucket-replicate/job-template.yaml
@@ -0,0 +1,97 @@
apiVersion: template.openshift.io/v1
kind: Template
metadata:
  name: rhobs-thanos-bucket-replicate
labels:
  app.kubernetes.io/name: thanos-bucket-replicate
  app.kubernetes.io/part-of: observatorium
description: |
  Replicate data between object storage provider buckets
parameters:
  - name: NAME
    description: The name of the Job.
    value: 'thanos-bucket-replicate'
  - name: NAMESPACE
    description: The namespace where the Job should run.
    value: 'observatorium-operations'
  - name: SOURCE_OBJ_STORE_CONFIG_SECRET_NAME
    value: 'thanos-bucket-replicate-config-source'
  - name: DESTINATION_OBJ_STORE_CONFIG_SECRET_NAME
    value: 'thanos-bucket-replicate-config-destination'
  - name: COMPACTION_MIN
    value: '0'
  - name: COMPACTION_MAX
    value: '100'
  - name: MIN_TIME
    value: '0000-01-01T00:00:00Z'
  - name: MAX_TIME
    value: '9999-12-31T23:59:59Z'
  - name: TENANT_ID
    value: 'rhobs'
  - name: IMAGE_TAG
    value: 'main-2023-08-01-e1a3ec1'
  - name: LOG_LEVEL
    value: 'info'
  - name: CPU_REQUEST
    value: '1'
  - name: CPU_LIMIT
    value: '2'
  - name: MEMORY_REQUEST
    value: '500Mi'
  - name: MEMORY_LIMIT
    value: '1Gi'
objects:
  - apiVersion: batch/v1
    kind: Job
    metadata:
      name: ${NAME}
      namespace: ${NAMESPACE}
      labels:
        app.kubernetes.io/name: thanos-bucket-replicate
        app.kubernetes.io/part-of: observatorium
    spec:
      parallelism: 1
      backoffLimit: 1
      template:
        spec:
          containers:
            - name: thanos-bucket-replicate
              image: quay.io/thanos/thanos:${IMAGE_TAG}
              resources:
                requests:
                  memory: ${MEMORY_REQUEST}
                  cpu: ${CPU_REQUEST}
                limits:
                  memory: ${MEMORY_LIMIT}
                  cpu: ${CPU_LIMIT}
              ports:
                - containerPort: 10902
                  name: metrics
              volumeMounts:
                - name: obj-store-from-config
                  readOnly: true
                  mountPath: "/var/lib/thanos/bucket-replicate-config/from"
                - name: obj-store-to-config
                  readOnly: true
                  mountPath: "/var/lib/thanos/bucket-replicate-config/to"
              args:
                - 'tools'
                - 'bucket'
                - 'replicate'
                - '--log.level=${LOG_LEVEL}'
                - '--objstore.config-file=/var/lib/thanos/bucket-replicate-config/from/config.yaml'
                - '--objstore-to.config-file=/var/lib/thanos/bucket-replicate-config/to/config.yaml'
                - '--single-run'
                - '--matcher=tenant_id="${TENANT_ID}"'
                - '--min-time=${MIN_TIME}'
                - '--max-time=${MAX_TIME}'
                - '--compaction-min=${COMPACTION_MIN}'
                - '--compaction-max=${COMPACTION_MAX}'
          restartPolicy: Never
          volumes:
            - name: obj-store-from-config
              secret:
                secretName: ${SOURCE_OBJ_STORE_CONFIG_SECRET_NAME}
            - name: obj-store-to-config
              secret:
                secretName: ${DESTINATION_OBJ_STORE_CONFIG_SECRET_NAME}
33 changes: 33 additions & 0 deletions resources/operations/bucket-replicate/monitoring-template.yaml
@@ -0,0 +1,33 @@
apiVersion: template.openshift.io/v1
kind: Template
metadata:
  name: rhobs-thanos-bucket-replicate-pod-monitor
labels:
  app.kubernetes.io/name: thanos-bucket-replicate
  app.kubernetes.io/part-of: observatorium
parameters:
  - name: NAMESPACE
    description: The namespace where the running Job will reside.
    value: 'observatorium-operations'
  - name: NAME
    description: The name of the Job.
    value: 'thanos-bucket-replicate'
objects:
  - apiVersion: monitoring.coreos.com/v1
    kind: PodMonitor
    metadata:
      name: observatorium-operations-thanos-bucket-replicate
      labels:
        prometheus: app-sre
    spec:
      namespaceSelector:
        matchNames:
          - ${NAMESPACE}
      selector:
        matchLabels:
          job-name: ${NAME}
      podMetricsEndpoints:
        - port: metrics
          interval: 30s
          path: /metrics
