Controller manager does not seem to be working properly due to caching errors #5103

Open
changhyuni opened this issue Aug 23, 2024 · 1 comment
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

changhyuni commented Aug 23, 2024

/kind bug

What steps did you take and what happened:

I0823 03:05:37.226125       1 request.go:1212] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"controlplane.cluster.x-k8s.io/v1beta2","resources":[{"name":"awsmanagedcontrolplanes","singularName":"awsmanagedcontrolplane","namespaced":true,"kind":"AWSManagedControlPlane","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["awsmcp"],"categories":["cluster-api"],"storageVersionHash":"WnEFh7oqH48="},{"name":"awsmanagedcontrolplanes/status","singularName":"","namespaced":true,"kind":"AWSManagedControlPlane","verbs":["get","patch","update"]},{"name":"rosacontrolplanes","singularName":"rosacontrolplane","namespaced":true,"kind":"ROSAControlPlane","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["rosacp"],"categories":["cluster-api"],"storageVersionHash":"qdhYg8dFBqo="},{"name":"rosacontrolplanes/status","singularName":"","namespaced":true,"kind":"ROSAControlPlane","verbs":["get","patch","update"]}]}
I0823 03:05:37.226379       1 shared_informer.go:337] stop requested
E0823 03:05:37.226393       1 kind.go:68] "controller-runtime/source/EventHandler: failed to get informer from cache" err="Timeout: failed waiting for *v1beta2.AWSManagedControlPlane Informer to sync"
I0823 03:05:37.226424       1 reflector.go:289] Starting reflector *v1beta2.AWSManagedControlPlane (9m13.30993253s) from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229
I0823 03:05:37.226445       1 shared_informer.go:337] stop requested
I0823 03:05:37.226459       1 shared_informer.go:337] stop requested
E0823 03:05:37.226464       1 kind.go:68] "controller-runtime/source/EventHandler: failed to get informer from cache" err="Timeout: failed waiting for *v1beta2.AWSManagedControlPlane Informer to sync"
I0823 03:05:37.226448       1 reflector.go:289] Starting reflector *v1beta2.AWSManagedCluster (10m42.211609855s) from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229
E0823 03:05:37.226477       1 kind.go:68] "controller-runtime/source/EventHandler: failed to get informer from cache" err="Timeout: failed waiting for *v1beta2.AWSCluster Informer to sync"
I0823 03:05:37.226481       1 reflector.go:295] Stopping reflector *v1beta2.AWSManagedCluster (10m42.211609855s) from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229
I0823 03:05:37.226486       1 shared_informer.go:337] stop requested
I0823 03:05:37.226433       1 shared_informer.go:337] stop requested
E0823 03:05:37.226497       1 kind.go:68] "controller-runtime/source/EventHandler: failed to get informer from cache" err="Timeout: failed waiting for *v1beta2.AWSCluster Informer to sync"
I0823 03:05:37.226437       1 shared_informer.go:337] stop requested
E0823 03:05:37.226511       1 kind.go:68] "controller-runtime/source/EventHandler: failed to get informer from cache" err="Timeout: failed waiting for *v1beta1.Machine Informer to sync"
I0823 03:05:37.226449       1 reflector.go:295] Stopping reflector *v1beta2.AWSManagedControlPlane (9m13.30993253s) from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229
I0823 03:05:37.226538       1 internal.go:530] "Stopping and waiting for webhooks"
E0823 03:05:37.226499       1 kind.go:68] "controller-runtime/source/EventHandler: failed to get informer from cache" err="Timeout: failed waiting for *v1beta2.AWSManagedCluster Informer to sync"
I0823 03:05:37.226429       1 shared_informer.go:337] stop requested
E0823 03:05:37.226562       1 kind.go:68] "controller-runtime/source/EventHandler: failed to get informer from cache" err="Timeout: failed waiting for *v1beta2.AWSManagedCluster Informer to sync"
I0823 03:05:37.226441       1 shared_informer.go:337] stop requested
E0823 03:05:37.226576       1 kind.go:68] "controller-runtime/source/EventHandler: failed to get informer from cache" err="Timeout: failed waiting for *v1beta2.AWSManagedControlPlane Informer to sync"
I0823 03:05:37.226596       1 server.go:249] "controller-runtime/webhook: Shutting down webhook server with timeout of 1 minute"
I0823 03:05:37.226660       1 internal.go:533] "Stopping and waiting for HTTP servers"
I0823 03:05:37.226688       1 server.go:43] "shutting down server" kind="health probe" addr="[::]:9440"
I0823 03:05:37.226717       1 internal.go:537] "Wait completed, proceeding to shutdown the manager"
E0823 03:05:37.226747       1 logger.go:99] "setup: problem running manager" err="failed to start metrics server: failed to create listener: listen tcp: address 8443: missing port in address"
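
Note on the last error line: the message "address 8443: missing port in address" has the shape of the error Go's `net` package returns when an address string has no `host:port` separator. That suggests the value passed via `--diagnostics-address=8443` is being parsed as a host with no port. The sketch below only illustrates that parsing behavior with the standard library; `checkListenAddr` is a hypothetical helper written for this example, not part of CAPA or controller-runtime.

```go
package main

import (
	"fmt"
	"net"
)

// checkListenAddr is a hypothetical helper for illustration only.
// It reports whether addr can be split into host and port, which is
// what a TCP listener needs before it can bind.
func checkListenAddr(addr string) error {
	_, _, err := net.SplitHostPort(addr)
	return err
}

func main() {
	// "8443" mirrors the value in the manifest; ":8443" adds the
	// leading colon so the string is read as "any host, port 8443".
	for _, addr := range []string{"8443", ":8443"} {
		if err := checkListenAddr(addr); err != nil {
			fmt.Printf("%q rejected: %v\n", addr, err)
			continue
		}
		fmt.Printf("%q accepted\n", addr)
	}
}
```

If this reading is right, passing `--diagnostics-address=:8443` (with the leading colon) may be enough for the metrics server to start. The informer sync timeouts earlier in the log could then be a side effect of the manager shutting down, rather than a separate caching problem, but that is an assumption and has not been confirmed in this thread.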

Environment:
Here is my manifest (controller manager):

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    argocd.argoproj.io/instance: cluster-api
    cluster.x-k8s.io/provider: infrastructure-aws
    control-plane: capa-controller-manager
  name: capa-controller-manager
  namespace: capa-system
spec:
  replicas: 1
  selector:
    matchLabels:
      cluster.x-k8s.io/provider: infrastructure-aws
      control-plane: capa-controller-manager
  template:
    metadata:
      labels:
        cluster.x-k8s.io/provider: infrastructure-aws
        control-plane: capa-controller-manager
    spec:
      containers:
        - args:
            - '--leader-elect'
            - '--feature-gates=EKS=true'
            - '--v=10'
            - '--diagnostics-address=8443'
            - '--insecure-diagnostics=false'
          env:
            - name: AWS_SHARED_CREDENTIALS_FILE
              value: /home/.aws/credentials
          image: >-
            kcr.dev.kabang.cloud/container-registry/external/cluster-api/cluster-api-aws-controller:v2.4.1
          imagePullPolicy: IfNotPresent
          imagePullSecrets: kcr-token
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: healthz
            periodSeconds: 10
          name: manager
          ports:
            - containerPort: 9443
              name: webhook-server
              protocol: TCP
            - containerPort: 9440
              name: healthz
              protocol: TCP
            - containerPort: 8443
              name: metrics
              protocol: TCP
          readinessProbe:
            httpGet:
              path: /readyz
              port: healthz
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            runAsGroup: 65532
            runAsUser: 65532
          volumeMounts:
            - mountPath: /tmp/k8s-webhook-server/serving-certs
              name: cert
              readOnly: true
      securityContext:
        fsGroup: 1000
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
      serviceAccountName: capa-controller-manager
      terminationGracePeriodSeconds: 10
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
        - effect: NoSchedule
          key: node-role.kubernetes.io/control-plane
      volumes:
        - name: cert
          secret:
            defaultMode: 420
            secretName: capa-webhook-service-cert

  • Cluster-api-provider-aws version: v2.4.1
  • Kubernetes version: (use kubectl version): 1.27 (eks)
  • OS (e.g. from /etc/os-release):
@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. needs-priority labels Aug 23, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Aug 23, 2024