This repository has been archived by the owner on Oct 20, 2022. It is now read-only.

secured nifi cluster : Failed to connect to headless svc host Connection refused #143

Open
omkadmi opened this issue Oct 18, 2021 · 3 comments

omkadmi commented Oct 18, 2021

Bug Report

What did you do?

I deployed an unsecured NiFi cluster -> it works
I deployed a secured NiFi cluster with a self-signed certificate (managed by nifikop) -> it works
I deployed a secured NiFi cluster with cert-manager + Let's Encrypt -> it does not work

I followed all the steps in the documentation https://orange-opensource.github.io/nifikop/blog/2020/06/30/secured_nifi_cluster_on_gcp_with_external_dns, but I still get this "Connection refused" error, even though the certificates are issued by cert-manager.
I also see the sslnifi entries created by ExternalDNS in Azure Private DNS.

For info: nifikop, ZooKeeper and the NiFi cluster are in the nifi namespace; cert-manager, letsencrypt and ExternalDNS are in the devops namespace.

I have this error in the pod log (which repeats ad infinitum):

Waiting for host to be reachable
failed to reach sslnifi-0-node.sslnifi-headless.mycompany.net:8443
Found: , expecting: 10.66.161.197
Found :
failed to reach sslnifi-0-node.sslnifi-headless.mycompany.net:8443
Found: , expecting: 10.66.161.197
Found :
failed to reach sslnifi-0-node.sslnifi-headless.mycompany.net:8443
Found: , expecting: 10.66.161.197
Found :
failed to reach sslnifi-0-node.sslnifi-headless.mycompany.net:8443
Found: , expecting: 10.66.161.197
Found :
failed to reach sslnifi-0-node.sslnifi-headless.mycompany.net:8443
Found: , expecting: 10.66.161.197
Found :
failed to reach sslnifi-0-node.sslnifi-headless.mycompany.net:8443
Found: , expecting: 10.66.161.197
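
The empty value after `Found:` suggests that the lookup of the node's external hostname returns nothing, so the comparison with the pod IP can never succeed. Below is a minimal Python sketch of that kind of check. It is not nifikop's actual startup script: `check_host`, `resolve_or_none` and the `resolver` parameter are hypothetical names, and the resolver is injectable so the logic can be tried without real DNS.

```python
import socket
from typing import Callable, Optional


def resolve_or_none(hostname: str) -> Optional[str]:
    """Return the first IPv4 address for hostname, or None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def check_host(hostname: str, expected_ip: str,
               resolver: Callable[[str], Optional[str]] = resolve_or_none) -> str:
    """Classify a lookup result (hypothetical helper, for illustration only)."""
    found = resolver(hostname)
    if found is None:
        return "unresolved"   # would correspond to an empty "Found:" value
    if found != expected_ip:
        return "mismatch"     # the name resolves, but to another address
    return "ok"
```

Running `check_host("sslnifi-0-node.sslnifi-headless.mycompany.net", "10.66.161.197")` from a machine that uses the same DNS servers as the pod would show whether the name resolves at all.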

I have this error in the pod describe output: Failed to connect to sslnifi-0-node.sslnifi-headless.nifi.svc.cluster.local port 8443

Readiness probe failed:
* Expire in 0 ms for 6 (transfer 0x557d85ecef50)
* Expire in 1 ms for 1 (transfer 0x557d85ecef50)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Expire in 0 ms for 1 (transfer 0x557d85ecef50)
* Expire in 1 ms for 1 (transfer 0x557d85ecef50)
* Expire in 0 ms for 1 (transfer 0x557d85ecef50)
* Expire in 0 ms for 1 (transfer 0x557d85ecef50)
* Expire in 0 ms for 1 (transfer 0x557d85ecef50)
* Trying 10.66.161.197...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x557d85ecef50)
* connect to 10.66.161.197 port 8443 failed: Connection refused
* Failed to connect to sslnifi-0-node.sslnifi-headless.nifi.svc.cluster.local port 8443: Connection refused
* Closing connection 0
curl: (7) Failed to connect to sslnifi-0-node.sslnifi-headless.nifi.svc.cluster.local port 8443: Connection refused

I don't understand why it looks up sslnifi-0-node.sslnifi-headless.nifi.svc.cluster.local (which ends in .cluster.local) when, I guess, it should use the .mycompany.net name.
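
One possible reading of the two names: the readiness probe uses the in-cluster Kubernetes DNS name, which for a pod behind a headless service has the form `<pod-hostname>.<service>.<namespace>.svc.<cluster DNS domain>`, while the startup wait uses a name built from `listenersConfig.clusterDomain`. The sketch below only reproduces the two names seen in the logs. The function names are hypothetical, the default `cluster.local` is an assumption, and how nifikop actually builds these strings is not confirmed here.

```python
def in_cluster_fqdn(pod_host: str, service: str, namespace: str,
                    k8s_domain: str = "cluster.local") -> str:
    """Kubernetes DNS name of a pod behind a headless service."""
    return f"{pod_host}.{service}.{namespace}.svc.{k8s_domain}"


def external_dns_fqdn(pod_host: str, service: str, cluster_domain: str) -> str:
    """Name shape seen in the startup log when an external domain is configured."""
    return f"{pod_host}.{service}.{cluster_domain}"


# Values taken from the logs and the NifiCluster spec in this report.
probe_name = in_cluster_fqdn("sslnifi-0-node", "sslnifi-headless", "nifi")
wait_name = external_dns_fqdn("sslnifi-0-node", "sslnifi-headless", "mycompany.net")
print(probe_name)
print(wait_name)
```

In the probe output the in-cluster name resolves to 10.66.161.197, the same address the startup wait expects, so the probe's "Connection refused" would be consistent with nothing listening on port 8443 yet (for example because startup is still blocked in the wait loop) rather than with the probe using a wrong hostname.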

Below is the cert-manager log:

I1018 09:49:32.621252       1 conditions.go:173] Setting lastTransitionTime for Certificate "sslnifi-controller.nifi.mgt.mycompany.net" condition "Issuing" to 2021-10-18 09:49:32.621243958 +0000 UTC m=+1231705.057373825
I1018 09:49:32.622054       1 conditions.go:173] Setting lastTransitionTime for Certificate "sslnifi-controller.nifi.mgt.mycompany.net" condition "Ready" to 2021-10-18 09:49:32.622049068 +0000 UTC m=+1231705.058179035
I1018 09:49:32.779307       1 conditions.go:173] Setting lastTransitionTime for Certificate "sslnifi-0-node.sslnifi-headless.mycompany.net" condition "Issuing" to 2021-10-18 09:49:32.77929993 +0000 UTC m=+1231705.215429897
I1018 09:49:32.781088       1 conditions.go:173] Setting lastTransitionTime for Certificate "sslnifi-0-node.sslnifi-headless.mycompany.net" condition "Ready" to 2021-10-18 09:49:32.781081754 +0000 UTC m=+1231705.217211621
E1018 09:49:32.829395       1 controller.go:158] cert-manager/controller/CertificateTrigger "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"sslnifi-controller.nifi.mgt.mycompany.net\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-controller.nifi.mgt.mycompany.net"
I1018 09:49:32.829670       1 conditions.go:173] Setting lastTransitionTime for Certificate "sslnifi-controller.nifi.mgt.mycompany.net" condition "Issuing" to 2021-10-18 09:49:32.829665191 +0000 UTC m=+1231705.265795058
E1018 09:49:32.937119       1 controller.go:158] cert-manager/controller/CertificateTrigger "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"sslnifi-0-node.sslnifi-headless.mycompany.net\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-0-node.sslnifi-headless.mycompany.net"
I1018 09:49:32.937207       1 conditions.go:173] Setting lastTransitionTime for Certificate "sslnifi-0-node.sslnifi-headless.mycompany.net" condition "Issuing" to 2021-10-18 09:49:32.937203301 +0000 UTC m=+1231705.373333268
E1018 09:49:32.964735       1 controller.go:158] cert-manager/controller/CertificateTrigger "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"sslnifi-controller.nifi.mgt.mycompany.net\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-controller.nifi.mgt.mycompany.net"
I1018 09:49:32.964930       1 conditions.go:173] Setting lastTransitionTime for Certificate "sslnifi-controller.nifi.mgt.mycompany.net" condition "Issuing" to 2021-10-18 09:49:32.964925064 +0000 UTC m=+1231705.401054931
E1018 09:49:33.528303       1 controller.go:158] cert-manager/controller/CertificateKeyManager "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"sslnifi-controller.nifi.mgt.mycompany.net\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-controller.nifi.mgt.mycompany.net"
E1018 09:49:33.913078       1 controller.go:158] cert-manager/controller/CertificateKeyManager "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"sslnifi-0-node.sslnifi-headless.mycompany.net\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-0-node.sslnifi-headless.mycompany.net"
I1018 09:49:34.942552       1 conditions.go:233] Setting lastTransitionTime for CertificateRequest "sslnifi-controller.nifi.mgt.mycompany.net-bhx8b" condition "Ready" to 2021-10-18 09:49:34.942545097 +0000 UTC m=+1231707.378674964
I1018 09:49:35.375578       1 conditions.go:233] Setting lastTransitionTime for CertificateRequest "sslnifi-0-node.sslnifi-headless.mycompany.net-fdrzw" condition "Ready" to 2021-10-18 09:49:35.375571575 +0000 UTC m=+1231707.811701442
I1018 09:49:36.157507       1 conditions.go:233] Setting lastTransitionTime for CertificateRequest "sslnifi-0-node.sslnifi-headless.mycompany.net-fdrzw" condition "Ready" to 2021-10-18 09:49:36.157499429 +0000 UTC m=+1231708.593629296
E1018 09:49:36.712804       1 controller.go:158] cert-manager/controller/certificaterequests-issuer-acme "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificaterequests.cert-manager.io \"sslnifi-0-node.sslnifi-headless.mycompany.net-fdrzw\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-0-node.sslnifi-headless.mycompany.net-fdrzw"
E1018 09:49:37.917592       1 controller.go:158] cert-manager/controller/certificaterequests-issuer-acme "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificaterequests.cert-manager.io \"sslnifi-controller.nifi.mgt.mycompany.net-bhx8b\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-controller.nifi.mgt.mycompany.net-bhx8b"
I1018 09:49:38.878739       1 acme.go:184] cert-manager/controller/certificaterequests-issuer-acme/sign "msg"="certificate issued" "related_resource_kind"="Order" "related_resource_name"="sslnifi-controller.nifi.mgt.mycompany.net-bhx8b-3833685911" "related_resource_namespace"="nifi" "related_resource_version"="v1" "resource_kind"="CertificateRequest" "resource_name"="sslnifi-controller.nifi.mgt.mycompany.net-bhx8b" "resource_namespace"="nifi" "resource_version"="v1"
I1018 09:49:38.878997       1 conditions.go:222] Found status change for CertificateRequest "sslnifi-controller.nifi.mgt.mycompany.net-bhx8b" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2021-10-18 09:49:38.878992016 +0000 UTC m=+1231711.315121983
E1018 09:49:39.372677       1 controller.go:158] cert-manager/controller/CertificateReadiness "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"sslnifi-controller.nifi.mgt.mycompany.net\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-controller.nifi.mgt.mycompany.net"
I1018 09:49:39.373408       1 conditions.go:162] Found status change for Certificate "sslnifi-controller.nifi.mgt.mycompany.net" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2021-10-18 09:49:39.373401899 +0000 UTC m=+1231711.809531866
E1018 09:49:39.593733       1 controller.go:158] cert-manager/controller/CertificateIssuing "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"sslnifi-controller.nifi.mgt.mycompany.net\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-controller.nifi.mgt.mycompany.net"
E1018 09:49:39.937234       1 controller.go:158] cert-manager/controller/CertificateReadiness "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"sslnifi-controller.nifi.mgt.mycompany.net\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-controller.nifi.mgt.mycompany.net"
I1018 09:49:39.937992       1 conditions.go:162] Found status change for Certificate "sslnifi-controller.nifi.mgt.mycompany.net" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2021-10-18 09:49:39.937985336 +0000 UTC m=+1231712.374115203
I1018 09:49:40.134578       1 acme.go:184] cert-manager/controller/certificaterequests-issuer-acme/sign "msg"="certificate issued" "related_resource_kind"="Order" "related_resource_name"="sslnifi-0-node.sslnifi-headless.mycompany.net-fdrzw-2332423181" "related_resource_namespace"="nifi" "related_resource_version"="v1" "resource_kind"="CertificateRequest" "resource_name"="sslnifi-0-node.sslnifi-headless.mycompany.net-fdrzw" "resource_namespace"="nifi" "resource_version"="v1"
I1018 09:49:40.135097       1 conditions.go:222] Found status change for CertificateRequest "sslnifi-0-node.sslnifi-headless.mycompany.net-fdrzw" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2021-10-18 09:49:40.135089423 +0000 UTC m=+1231712.571219390
E1018 09:49:40.136239       1 controller.go:158] cert-manager/controller/CertificateKeyManager "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"sslnifi-controller.nifi.mgt.mycompany.net\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-controller.nifi.mgt.mycompanyv.net"
E1018 09:49:41.572069       1 controller.go:158] cert-manager/controller/CertificateReadiness "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"sslnifi-0-node.sslnifi-headless.mycompany.net\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-0-node.sslnifi-headless.mycompany.net"
I1018 09:49:41.573131       1 conditions.go:162] Found status change for Certificate "sslnifi-0-node.sslnifi-headless.mycompany.net" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2021-10-18 09:49:41.573123467 +0000 UTC m=+1231714.009253334
E1018 09:49:42.044550       1 controller.go:158] cert-manager/controller/CertificateReadiness "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"sslnifi-0-node.sslnifi-headless.mycompany.net\": the object has been modified; please apply your changes to the latest version and try again" "key"="nifi/sslnifi-0-node.sslnifi-headless.mycompany.net"
I1018 09:49:42.045275       1 conditions.go:162] Found status change for Certificate "sslnifi-0-node.sslnifi-headless.mycompany.net" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2021-10-18 09:49:42.045269625 +0000 UTC m=+1231714.481399492
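
The `E` lines in this log all carry the message "the object has been modified; please apply your changes to the latest version and try again", which is the Kubernetes API server's standard reply to an update made against a stale resource version; the controller re-queues the item and retries. The `certificate issued` lines and the final `"Ready": "False" -> "True"` transitions suggest that both certificates were issued. Below is a sketch for separating the two kinds of lines, assuming the klog format shown above; `summarize` is a hypothetical helper.

```python
import re

CONFLICT = "the object has been modified"
READY_TRUE = re.compile(
    r'Found status change for Certificate "([^"]+)" '
    r'condition "Ready": "False" -> "True"'
)


def summarize(log_lines):
    """Count conflict retries, collect other errors, and list Ready certificates."""
    conflicts, other_errors, ready = 0, [], set()
    for line in log_lines:
        if line.startswith("E"):          # klog severity prefix: E = error
            if CONFLICT in line:
                conflicts += 1            # update conflict, retried by the controller
            else:
                other_errors.append(line)
        match = READY_TRUE.search(line)
        if match:
            ready.add(match.group(1))     # Certificate name that became Ready
    return conflicts, other_errors, ready
```

If `other_errors` comes back empty for the full log and both certificate names appear in `ready`, the certificate side is probably not what keeps the pod at 0/1.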

Below is the ExternalDNS log:

time="2021-10-13T09:35:52Z" level=info msg="Updating A record named 'sslnifi-int' to '10.66.161.134' for Azure Private DNS zone 'mycompany.net'."
time="2021-10-13T09:35:52Z" level=info msg="Updating A record named 'sslnifi-0-node.sslnifi-int' to '10.66.161.134' for Azure Private DNS zone 'mycompany.net'."
time="2021-10-13T09:35:53Z" level=info msg="Updating TXT record named 'sslnifi-int' to '\"heritage=external-dns,external-dns/owner=<server_name>,external-dns/resource=service/nifi/sslnifi-headless\"' for Azure Private DNS zone 'mycompany.net'."
time="2021-10-13T09:35:53Z" level=info msg="Updating TXT record named 'sslnifi-0-node.sslnifi-int' to '\"heritage=external-dns,external-dns/owner=<server_name>,external-dns/resource=service/nifi/sslnifi-headless\"' for Azure Private DNS zone 'mycompany.net'."
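
The ExternalDNS lines above mention records under `sslnifi-int`, whereas the pod waits on a name under `sslnifi-headless`. Whether that difference matters depends on the current state of the zone (this excerpt is dated 13 October, the cert-manager log 18 October), but it can be checked mechanically. Below is a small sketch, assuming the log format shown above; `a_records` is a hypothetical helper.

```python
import re

# Matches ExternalDNS messages of the form:
#   Updating A record named 'x' to '1.2.3.4' for Azure Private DNS zone 'z'
A_RECORD = re.compile(
    r"Updating A record named '([^']+)' to '([^']+)' "
    r"for Azure Private DNS zone '([^']+)'"
)


def a_records(log_text: str) -> dict:
    """Map each fully qualified A record name to the IP it was updated to."""
    records = {}
    for name, ip, zone in A_RECORD.findall(log_text):
        records[f"{name}.{zone}"] = ip
    return records


log = """
time="2021-10-13T09:35:52Z" level=info msg="Updating A record named 'sslnifi-int' to '10.66.161.134' for Azure Private DNS zone 'mycompany.net'."
time="2021-10-13T09:35:52Z" level=info msg="Updating A record named 'sslnifi-0-node.sslnifi-int' to '10.66.161.134' for Azure Private DNS zone 'mycompany.net'."
"""
wanted = "sslnifi-0-node.sslnifi-headless.mycompany.net"
records = a_records(log)
print(wanted in records)  # False for this excerpt
print(records)
```

A more recent log, or a direct lookup of the `sslnifi-headless` name, would be needed to confirm whether an A record for the node currently exists.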

Output of kubectl get all -n nifi:

NAME                               READY   STATUS    RESTARTS   AGE
pod/nifikop-int-76cbbff7c6-8fz2g   1/1     Running   0          47h
pod/sslnifi-0-nodexbwjp            0/1     Running   0          4m41s
pod/zookeeper-0                    1/1     Running   0          5d1h

NAME                         TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                       AGE
service/clusterip            ClusterIP   10.0.18.0     <none>        8443/TCP                      36m
service/sslnifi-headless     ClusterIP   None          <none>        8443/TCP,6007/TCP,10000/TCP   36m
service/zookeeper            ClusterIP   10.0.18.121   <none>        2181/TCP,2888/TCP,3888/TCP    5d1h
service/zookeeper-headless   ClusterIP   None          <none>        2181/TCP,2888/TCP,3888/TCP    5d1h

NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nifikop-int   1/1     1            1           47h

NAME                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/nifikop-int-76cbbff7c6   1         1         1       47h

NAME                         READY   AGE
statefulset.apps/zookeeper   1/1     5d1h

Below is my configuration:

apiVersion: nifi.orange.com/v1alpha1
kind: NifiCluster
metadata:
  name: sslnifi
  namespace: nifi
spec:
  service:
    headlessEnabled: true
    annotations:
      external-dns.alpha.kubernetes.io/ttl: "60"
  zkAddress: "zookeeper:2181"
  zkPath: "/sslnifi"
  clusterImage: "apache/nifi:1.12.1"
  oneNifiNodePerNode: false
  managedAdminUsers:
    -  identity : "[email protected]"
       name: "myname"
  managedReaderUsers:
    -  identity : "[email protected]"
       name: "toto"
  propagateLabels: true
  nifiClusterTaskSpec:
    retryDurationMinutes: 20
  readOnlyConfig:
    nifiProperties:
      webProxyHosts:
        - sslnifi-int.mycompany.net
      # Additional nifi.properties configuration that will override the one produced based
      # on template and configurations.
      overrideConfigs: |
        nifi.security.user.oidc.discovery.url=https://accounts.google.com/.well-known/openid-configuration
        nifi.security.user.oidc.client.id=xxxxxxxxxxxxxxxxxxx
        nifi.security.user.oidc.client.secret=xxxxxxxxxxxxxxxxxx
        nifi.security.identity.mapping.pattern.dn=CN=([^,]*)(?:, (?:O|OU)=.*)?
        nifi.security.identity.mapping.value.dn=$1
        nifi.security.identity.mapping.transform.dn=NONE
  nodeConfigGroups:
    default_group:
      isNode: true
      storageConfigs:
        - mountPath: "/opt/nifi/nifi-current/logs"
          name: logs
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "default"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/data"
          name: data
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "default"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/flowfile_repository"
          name: flowfile-repository
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "default"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/nifi-current/conf"
          name: conf
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "default"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/content_repository"
          name: content-repository
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "default"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/provenance_repository"
          name: provenance-repository
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "default"
            resources:
              requests:
                storage: 10Gi
      serviceAccountName: "default"
      resourcesRequirements:
        limits:
          cpu: "2"
          memory: 3Gi
        requests:
          cpu: "1"
          memory: 1Gi
  nodes:
    - id: 0
      nodeConfigGroup: "default_group"
  listenersConfig:
    useExternalDNS: true
    clusterDomain: "mycompany.net"
    internalListeners:
      - type: "https"
        name: "https"
        containerPort: 8443
      - type: "cluster"
        name: "cluster"
        containerPort: 6007
      - type: "s2s"
        name: "s2s"
        containerPort: 10000
    sslSecrets:
      tlsSecretName: "sslnifi-int.mycompany.net-tls"
      create: true
      clusterScoped: true
      issuerRef:
        kind: ClusterIssuer
        name: letsencrypt-staging
  externalServices:
    - name: "clusterip"
      spec:
        type: ClusterIP
        portConfigs:
          - port: 8443
            internalListenerName: "https"
      serviceAnnotations:
        toto: tata

I deployed nifikop with:

helm repo add orange-incubator https://orange-kubernetes-charts-incubator.storage.googleapis.com/
helm install nifikop \
    orange-incubator/nifikop \
    --namespace=nifi \
    --version 0.7.0 \
    --set image.tag=v0.7.0-release \
    --set resources.requests.memory=256Mi \
    --set resources.requests.cpu=250m \
    --set resources.limits.memory=256Mi \
    --set resources.limits.cpu=250m \
    --set namespaces={"nifi"}

Thank you in advance for your help. I have been working on this for a few days and I don't see a solution.

What did you expect to see?
The NiFi node pod should be in Running status with 1/1 containers ready.

What did you see instead? Under which circumstances?
The NiFi node pod is Running but stays at 0/1 ready.

Environment

  • nifikop version: 0.7.0 (the same problem with 0.6.3)
  • go version:
  • Kubernetes version information: v1.19.11
  • Kubernetes cluster kind:
  • NiFi version: 1.12.1

@wandersonpereira

I have the same problem!

omkadmi commented Nov 22, 2021

Does nobody else have this problem? It is a major problem, though.

@wandersonpereira

Hello @omkadmi!

Did you resolve this problem?
