feat: add pvcCleaner for events #3634

Closed
wants to merge 3 commits
1 change: 1 addition & 0 deletions .changelog/3634.added.txt
@@ -0,0 +1 @@
feat: add pvcCleaner for events
1 change: 1 addition & 0 deletions deploy/helm/sumologic/README.md
@@ -482,6 +482,7 @@ The following table lists the configurable parameters of the Sumo Logic chart and their default values.
| `tailing-sidecar-operator.scc.create` | Create OpenShift's Security Context Constraint | `false` |
| `pvcCleaner.metrics.enabled` | Flag to enable cleaning unused PVCs for otelcol metrics statefulsets. | `false` |
| `pvcCleaner.logs.enabled` | Flag to enable cleaning unused PVCs for otelcol logs statefulsets. | `false` |
| `pvcCleaner.events.enabled` | Flag to enable cleaning unused PVCs for otelcol events statefulsets. | `false` |
| `pvcCleaner.job.image.repository` | Image repository for pvcCleaner docker containers. | `public.ecr.aws/sumologic/kubernetes-tools-kubectl` |
| `pvcCleaner.job.image.tag` | Image tag for pvcCleaner docker containers. | `2.22.0` |
| `pvcCleaner.job.image.pullPolicy` | Image pullPolicy for pvcCleaner docker containers. | `IfNotPresent` |
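For reference, enabling the new cleaner from a user values file would look roughly like this — a minimal sketch based on the parameters above; the schedule value is illustrative, not a chart default:

    pvcCleaner:
      events:
        enabled: true
      job:
        schedule: "*/15 * * * *"  # illustrative cadence; the chart's default lives in values.yaml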
10 changes: 9 additions & 1 deletion deploy/helm/sumologic/templates/_helpers/_events.tpl
@@ -138,4 +138,12 @@ otlp
{{- else -}}
{{- template "kubernetes.defaultAffinity" . -}}
{{- end -}}
{{- end -}}
{{- end -}}

{{- define "sumologic.metadata.name.pvcCleaner.events" -}}
{{- template "sumologic.metadata.name.pvcCleaner" . }}-events
{{- end -}}

{{- define "sumologic.labels.app.pvcCleaner.events" -}}
{{- template "sumologic.labels.app.pvcCleaner" . }}-events
{{- end -}}
@@ -1,4 +1,4 @@
{{- if or (eq .Values.pvcCleaner.logs.enabled true) (eq .Values.pvcCleaner.metrics.enabled true) }}
{{- if or (eq .Values.pvcCleaner.logs.enabled true) (eq .Values.pvcCleaner.metrics.enabled true) (eq .Values.pvcCleaner.events.enabled true) }}
apiVersion: v1
kind: ConfigMap
metadata:
71 changes: 71 additions & 0 deletions deploy/helm/sumologic/templates/pvc-cleaner/cron-job-events.yaml
@@ -0,0 +1,71 @@
{{- if eq .Values.pvcCleaner.events.enabled true }}
apiVersion: batch/v1
kind: CronJob
metadata:
name: {{ template "sumologic.metadata.name.pvcCleaner.events" . }}
namespace: {{ template "sumologic.namespace" . }}
labels:
app: {{ template "sumologic.labels.app.pvcCleaner.events" . }}
{{- include "sumologic.labels.common" . | nindent 4 }}
spec:
schedule: {{ .Values.pvcCleaner.job.schedule | quote }}
jobTemplate:
spec:
template:
metadata:
name: {{ template "sumologic.metadata.name.pvcCleaner.events" . }}
labels:
app: {{ template "sumologic.labels.app.pvcCleaner.events" . }}
{{- include "sumologic.labels.common" . | nindent 12 }}
{{- with .Values.sumologic.podLabels }}
{{ toYaml . | indent 12 }}
{{- end }}
{{- with .Values.pvcCleaner.job.podLabels }}
{{ toYaml . | indent 12 }}
{{- end }}
annotations:
{{- with .Values.sumologic.podAnnotations }}
{{ toYaml . | indent 12 }}
{{- end }}
{{- with .Values.pvcCleaner.job.podAnnotations }}
{{ toYaml . | indent 12 }}
{{- end }}
spec:
nodeSelector:
{{- if not (empty (include "pvcCleaner.job.nodeSelector" .)) }}
{{ include "pvcCleaner.job.nodeSelector" . | indent 12 }}
{{- end }}
{{- if not (empty (include "pvcCleaner.job.tolerations" .)) }}
tolerations:
{{ include "pvcCleaner.job.tolerations" . | indent 12 }}
{{- end }}
{{- if not (empty (include "pvcCleaner.job.affinity" .)) }}
affinity:
{{ include "pvcCleaner.job.affinity" . | indent 12 }}
{{- end }}
{{- with .Values.pvcCleaner.job.securityContext }}
securityContext:
{{ toYaml . | indent 12 }}
{{- end }}
containers:
- name: pvc-cleaner
image: {{ .Values.pvcCleaner.job.image.repository }}:{{ .Values.pvcCleaner.job.image.tag }}
command:
- "bash"
- "/pvc-cleaner/pvc-cleaner.sh"
- "{{ template "sumologic.namespace" . }}"
- "app={{ template "sumologic.labels.app.events.statefulset" . }}"

I don't like adding a whole separate CronJob manifest just to change this one parameter. Can we use a tpl function instead?

Contributor Author

Actually, events differ more, since they don't have an HPA:

diff --git a/deploy/helm/sumologic/templates/pvc-cleaner/cron-job-metrics.yaml b/deploy/helm/sumologic/templates/pvc-cleaner/cron-job-events.yaml
index 4b37d360..6e929967 100644
--- a/deploy/helm/sumologic/templates/pvc-cleaner/cron-job-metrics.yaml
+++ b/deploy/helm/sumologic/templates/pvc-cleaner/cron-job-events.yaml
@@ -1,11 +1,11 @@
-{{- if eq .Values.pvcCleaner.metrics.enabled true }}
+{{- if eq .Values.pvcCleaner.events.enabled true }}
 apiVersion: batch/v1
 kind: CronJob
 metadata:
-  name: {{ template "sumologic.metadata.name.pvcCleaner.metrics" . }}
+  name: {{ template "sumologic.metadata.name.pvcCleaner.events" . }}
   namespace: {{ template "sumologic.namespace"  . }}
   labels:
-    app: {{ template "sumologic.labels.app.pvcCleaner.metrics" . }}
+    app: {{ template "sumologic.labels.app.pvcCleaner.events" . }}
     {{- include "sumologic.labels.common" . | nindent 4 }}
 spec:
   schedule: {{ .Values.pvcCleaner.job.schedule | quote }}
@@ -13,9 +13,9 @@ spec:
     spec:
       template:
         metadata:
-          name: {{ template "sumologic.metadata.name.pvcCleaner.metrics" . }}
+          name: {{ template "sumologic.metadata.name.pvcCleaner.events" . }}
           labels:
-            app: {{ template "sumologic.labels.app.pvcCleaner.metrics" . }}
+            app: {{ template "sumologic.labels.app.pvcCleaner.events" . }}
 {{- include "sumologic.labels.common" . | nindent 12 }}
 {{- with .Values.sumologic.podLabels }}
 {{ toYaml . | indent 12 }}
@@ -54,8 +54,7 @@ spec:
             - "bash"
             - "/pvc-cleaner/pvc-cleaner.sh"
             - "{{ template "sumologic.namespace" . }}"
-            - "app={{ template "sumologic.labels.app.metrics.statefulset" . }}"
-            - "{{ template "sumologic.metadata.name.metrics.hpa" . }}"
+            - "app={{ template "sumologic.labels.app.events.statefulset" . }}"
             imagePullPolicy: {{ .Values.pvcCleaner.job.image.pullPolicy }}
             resources:
               {{- toYaml .Values.pvcCleaner.job.resources | nindent 14 }}

Contributor

Worth mentioning that we already have separate manifests for metrics and logs: https://github.com/SumoLogic/sumologic-kubernetes-collection/tree/main/deploy/helm/sumologic/templates/pvc-cleaner


I know, and I think 3 copies is enough to start making them DRY. It'd be best if we could have just one, with different parameters depending on config, but I'm not sure how realistic that is.
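For illustration, a DRY version might look roughly like the sketch below — a single named template taking the root context plus per-signal parameters. The pvcCleaner.cronjob helper name, the dict keys, and the optional hpa argument are hypothetical, not existing chart helpers:

    {{- define "pvcCleaner.cronjob" -}}
    {{- $ctx := .root -}}
    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: {{ template "sumologic.metadata.name.pvcCleaner" $ctx }}-{{ .signal }}
      namespace: {{ template "sumologic.namespace" $ctx }}
    spec:
      schedule: {{ $ctx.Values.pvcCleaner.job.schedule | quote }}
      jobTemplate:
        spec:
          template:
            spec:
              # labels, nodeSelector, tolerations, volumes etc. omitted for brevity
              containers:
                - name: pvc-cleaner
                  image: {{ $ctx.Values.pvcCleaner.job.image.repository }}:{{ $ctx.Values.pvcCleaner.job.image.tag }}
                  command:
                    - "bash"
                    - "/pvc-cleaner/pvc-cleaner.sh"
                    - "{{ template "sumologic.namespace" $ctx }}"
                    - "app={{ .labelSelector }}"
                    {{- with .hpa }}
                    - {{ . | quote }}  # only metrics would pass an HPA name
                    {{- end }}
              restartPolicy: Never
    {{- end -}}

Each per-signal file would then shrink to roughly:

    {{- if .Values.pvcCleaner.events.enabled }}
    {{- include "pvcCleaner.cronjob" (dict "root" . "signal" "events" "labelSelector" (include "sumologic.labels.app.events.statefulset" .)) }}
    {{- end }}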

Contributor Author

I can create an issue for that. Right now I'm on customer support, so I'd like to get this done quickly and with low effort.


If that happens, the Pod will fail to start in the new AZ, because the PVC name is the same, so the previous PVC would need to be manually deleted anyway.
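For concreteness, that manual cleanup would look roughly like this (namespace, label, and PVC name are illustrative):

    kubectl -n sumologic get pvc -l app=RELEASE-NAME-sumologic-otelcol-events   # find the orphaned claim
    kubectl -n sumologic delete pvc buffer-RELEASE-NAME-sumologic-otelcol-events-0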

Contributor Author

This is what pvcCleaner is for: removing a PVC that can no longer be attached to the Pod.


pvcCleaner was originally for cleaning up unused PVCs after metadata downscaling. It was definitely not for fixing AZ-related volume mounting problems. And it doesn't run frequently enough by default to be able to do this effectively anyway.

Contributor Author

Actually, I disagree. The original problem was an AZ problem AFAIR, but it was a result of not cleaning up unused PVCs. I think we can discuss it internally 😅

Contributor Author

The outcome after testing is that this PR doesn't resolve the issue. Closed in favor of #3639.

imagePullPolicy: {{ .Values.pvcCleaner.job.image.pullPolicy }}
resources:
{{- toYaml .Values.pvcCleaner.job.resources | nindent 14 }}
volumeMounts:
- name: pvc-cleaner
mountPath: /pvc-cleaner
volumes:
- configMap:
defaultMode: 420
name: {{ template "sumologic.metadata.name.pvcCleaner.configmap" . }}
name: pvc-cleaner
restartPolicy: Never
serviceAccountName: {{ template "sumologic.metadata.name.pvcCleaner.roles.serviceaccount" . }}
{{- end }}
@@ -1,4 +1,4 @@
{{- if or (eq .Values.pvcCleaner.logs.enabled true) (eq .Values.pvcCleaner.metrics.enabled true) }}
{{- if or (eq .Values.pvcCleaner.logs.enabled true) (eq .Values.pvcCleaner.metrics.enabled true) (eq .Values.pvcCleaner.events.enabled true) }}
apiVersion: v1
kind: ServiceAccount
metadata:
2 changes: 2 additions & 0 deletions deploy/helm/sumologic/values.yaml
@@ -2363,6 +2363,8 @@ pvcCleaner:
enabled: false
logs:
enabled: false
events:
enabled: false

job:
image:
@@ -0,0 +1,79 @@
sumologic:
podLabels:
someSumo: label
podAnnotations:
someSumo: annotation
nodeSelector:
notMy: node
tolerations:
- key: null
operator: NotExists
effect: "TestFail"
affinity:
nodeAffinity:
requiredSomethingDuringSomethingElse:
nodeSelectorTerms:
- matchExpressions:
- key: definitely_not
operator: In
values:
- a-correct-affinity

pvcCleaner:
events:
enabled: true
job:
image:
repository: private.ecr.aws/sumologic/kubernetes-tools
tag: x.y.z
pullPolicy: Always
pullSecrets:
- name: myRegistryKeySecretName
resources:
limits:
memory: 1025Mi
cpu: 31m
requests:
memory: 63Mi
cpu: 12m
nodeSelector:
my: node
# clean up kubernetes.io/os selector
kubernetes.io/os: null
## Add custom labels only to setup job pod

## Node tolerations for server scheduling to nodes with taints
## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
##
tolerations:
- key: null
operator: Exists
effect: "NoSchedule"

## Affinity and anti-affinity
## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
##
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- RELEASE-NAME-sumologic-otelcol-logs
- RELEASE-NAME-sumologic-otelcol-metrics
- RELEASE-NAME-sumologic-otelcol-events
- key: app
operator: In
values:
- prometheus-operator-prometheus
topologyKey: "kubernetes.io/hostname"

podLabels:
some: label
## Add custom annotations only to setup job pod
podAnnotations:
some: annotation

schedule: "*/2 * * * *"
@@ -0,0 +1,81 @@
---
# Source: sumologic/templates/pvc-cleaner/cron-job-events.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
name: RELEASE-NAME-sumologic-pvc-cleaner-events
namespace: sumologic
labels:
app: pvc-cleaner-events
chart: "sumologic-%CURRENT_CHART_VERSION%"
release: "RELEASE-NAME"
heritage: "Helm"
spec:
schedule: "*/2 * * * *"
jobTemplate:
spec:
template:
metadata:
name: RELEASE-NAME-sumologic-pvc-cleaner-events
labels:
app: pvc-cleaner-events
chart: "sumologic-%CURRENT_CHART_VERSION%"
release: "RELEASE-NAME"
heritage: "Helm"
someSumo: label
some: label
annotations:
someSumo: annotation
some: annotation
spec:
nodeSelector:
kubernetes.io/os: null
my: node
tolerations:
- effect: NoSchedule
key: null
operator: Exists
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- RELEASE-NAME-sumologic-otelcol-logs
- RELEASE-NAME-sumologic-otelcol-metrics
- RELEASE-NAME-sumologic-otelcol-events
- key: app
operator: In
values:
- prometheus-operator-prometheus
topologyKey: kubernetes.io/hostname
securityContext:
runAsUser: 1000
containers:
- name: pvc-cleaner
image: private.ecr.aws/sumologic/kubernetes-tools:x.y.z
command:
- "bash"
- "/pvc-cleaner/pvc-cleaner.sh"
- "sumologic"
- "app=RELEASE-NAME-sumologic-otelcol-events"
imagePullPolicy: Always
resources:
limits:
cpu: 31m
memory: 1025Mi
requests:
cpu: 12m
memory: 63Mi
volumeMounts:
- name: pvc-cleaner
mountPath: /pvc-cleaner
volumes:
- configMap:
defaultMode: 420
name: RELEASE-NAME-sumologic-pvc-cleaner
name: pvc-cleaner
restartPolicy: Never
serviceAccountName: RELEASE-NAME-sumologic-pvc-cleaner