This repository has been archived by the owner on Mar 6, 2023. It is now read-only.

certs not updating. leader-election blocked by lock? #167

Open
jkassis opened this issue Jan 3, 2022 · 5 comments

@jkassis

jkassis commented Jan 3, 2022

What happened:

  1. Certificates that previously renewed correctly are no longer updating.
  2. Two instances of openshift-acme are running.
    image
  3. One instance reporting this...
    I0103 00:24:46.571147 1 leaderelection.go:352] lock is held by openshift-acme-7f65979ff9-hgsz4_8f58d3f6-9cf7-4745-af7b-476b0505caa9 and has not yet expired
    I0103 00:24:46.571381 1 leaderelection.go:247] failed to acquire lease fg/acme-controller-locks
  4. Other instance reporting this...
    I0103 00:24:16.217493       1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.Route total 0 items received
    I0103 00:25:04.539294       1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.LimitRange total 0 items received
    I0103 00:25:15.362207       1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.ReplicaSet total 0 items received
    I0103 00:26:15.924614       1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.Service total 0 items received
    I0103 00:27:04.606876       1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.ConfigMap total 2054 items received
    I0103 00:27:30.959775       1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.LimitRange total 0 items received
    I0103 00:27:55.497750       1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.Secret total 9 items received
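For context on the "lock is held by ... and has not yet expired" message in step 3: with lease-based leader election, a standby replica logs something like this for as long as another replica holds the lock and keeps renewing it, so on its own the message does not necessarily indicate a fault. The following is a simplified model of that decision, not the client-go code; the function name, the identities, and the 60-second lease are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def can_acquire(holder, my_identity, last_observed_change, now, lease_duration):
    """Simplified model of a lease-based leader-election check.

    A candidate may take the lock when nobody holds it, when it is already
    the holder, or when the holder's record has not changed for at least
    one lease duration (the holder stopped renewing).
    """
    if not holder or holder == my_identity:
        return True
    return last_observed_change + lease_duration <= now


# Hypothetical values, for illustration only.
lease = timedelta(seconds=60)
seen = datetime(2022, 1, 3, 0, 24, 46, tzinfo=timezone.utc)

# A live leader renewed 10 seconds ago: the standby must keep waiting.
standby_blocked = not can_acquire("pod-a", "pod-b", seen, seen + timedelta(seconds=10), lease)

# The leader has not renewed for longer than the lease: the standby may take over.
standby_takes_over = can_acquire("pod-a", "pod-b", seen, seen + timedelta(seconds=61), lease)

print(standby_blocked, standby_takes_over)
```

Under this model, a holder that is genuinely gone stops renewing, and the lock becomes acquirable once one lease duration has passed since the last observed change.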

What you expected to happen:
Clean logs and certificates up to date.

How to reproduce it (as minimally and precisely as possible):
Not sure.

Anything else we need to know?:

Environment:

  • OpenShift/Kubernetes version (use oc/kubectl version):
    OKD 4.7.0

image

  • Others:

@tnozicka

@jkassis
Author

jkassis commented Jan 3, 2022

Seeing this when loading the cert:

[I] jkassis@Jeremys-MBP ~ [124]> ws "wss://pubsub.shinetribe.media/connPut?ConnUUID=b3f0b2d8-f5f8-452c-83fc-c476ecb7a3df"                               01.02 16:36
x509: certificate has expired or is not yet valid: current time 2022-01-02T16:36:11-08:00 is after 2022-01-02T01:42:28Z
[I] jkassis@Jeremys-MBP ~ [1]>                                                                                                                          01.02 16:36

@jkassis
Author

jkassis commented Jan 3, 2022

Brought the pods down and the "leader election blocked" logs reappear, so I'm proceeding as if this is normal. Looking at the Certificate status, the cert appears to be up for re-issue on 02-01, which seems odd given that the fetched cert has already expired.

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  creationTimestamp: '2021-10-04T02:24:53Z'
  generation: 3
  managedFields:
    - apiVersion: cert-manager.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:spec':
          .: {}
          'f:commonName': {}
          'f:dnsNames': {}
          'f:issuerRef':
            .: {}
            'f:kind': {}
            'f:name': {}
          'f:secretName': {}
      manager: Mozilla
      operation: Update
      time: '2021-10-04T02:38:51Z'
    - apiVersion: cert-manager.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:spec':
          'f:privateKey': {}
        'f:status':
          .: {}
          'f:conditions': {}
          'f:notAfter': {}
          'f:notBefore': {}
          'f:renewalTime': {}
          'f:revision': {}
      manager: controller
      operation: Update
      time: '2021-12-03T01:42:28Z'
  name: pubsub-shinetribe-media
  namespace: fg
  resourceVersion: '307455716'
  selfLink: /apis/cert-manager.io/v1/namespaces/fg/certificates/pubsub-shinetribe-media
  uid: a528dc92-636c-40c8-862e-38dfa6986cc7
spec:
  commonName: pubsub.shinetribe.media
  dnsNames:
    - pubsub.shinetribe.media
  issuerRef:
    kind: Issuer
    name: le-wildcard-issuer
  secretName: cert-pubsub-shinetribe-media
status:
  conditions:
    - lastTransitionTime: '2021-10-04T02:42:30Z'
      message: Certificate is up to date and has not expired
      observedGeneration: 3
      reason: Ready
      status: 'True'
      type: Ready
  notAfter: '2022-03-03T00:44:07Z'
  notBefore: '2021-12-03T00:44:08Z'
  renewalTime: '2022-02-01T00:44:07Z'
  revision: 3

@jkassis
Author

jkassis commented Jan 3, 2022

It seems like the algorithm that determines the renewal time is broken. Here's what my browser gets for that cert, roughly 1 day off:

image
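As a sanity check on the renewal-time question, the arithmetic can be reproduced from the timestamps already posted in this thread. The sketch below assumes cert-manager's documented default of renewing once two thirds of the certificate lifetime has elapsed (the spec above sets no `renewBefore`); the helper name is made up for illustration. It also compares the expiry from the x509 error with the `notAfter` in the status.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# Values copied from the Certificate status and the x509 error in this thread.
not_before = datetime(2021, 12, 3, 0, 44, 8, tzinfo=UTC)
not_after = datetime(2022, 3, 3, 0, 44, 7, tzinfo=UTC)
reported_renewal = datetime(2022, 2, 1, 0, 44, 7, tzinfo=UTC)
served_not_after = datetime(2022, 1, 2, 1, 42, 28, tzinfo=UTC)


def default_renewal_time(start, end):
    """ASSUMPTION: renewal is scheduled when one third of the lifetime remains."""
    return end - (end - start) / 3


# Drop sub-second precision so the result is comparable to the status field.
computed = default_renewal_time(not_before, not_after).replace(microsecond=0)

matches_status = computed == reported_renewal
served_expires_earlier = served_not_after < not_after

print(computed.isoformat(), matches_status, served_expires_earlier)
```

If the assumption holds, the reported `renewalTime` is consistent with the `notBefore`/`notAfter` in the status, while the certificate in the x509 error expires well before that `notAfter`. That may mean the endpoint is serving a different certificate than the one this status describes; this is an inference from the timestamps, not something confirmed in the thread.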

@tux-o-matic

I believe the problem has been there all along.
I'm forced to delete the Pods once in a while to ensure the renewal process gets triggered.

@brianorwhatever

Encountering this issue as well. I have tried force-deleting the pods, and scaling the running pods down to 0 and back up, but the lock is still held by some ghost.
