
Nifi scale up #139

Open
iordaniordanov opened this issue Oct 13, 2021 · 4 comments
Assignees: erdrix
Labels: community, help wanted

Comments

@iordaniordanov

iordaniordanov commented Oct 13, 2021

Type of question

About general context and help around nifikop

Question

What did you do?
Increased number of nodes in the nificluster CR from 3 to 6
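
Concretely, the change was to extend spec.nodes in the NifiCluster CR (full config further below) from 3 entries to 6, roughly like this (the new ids and the reuse of default_group are illustrative, matching the existing entries):

  nodes:
  - id: 0
    nodeConfigGroup: default_group
  - id: 1
    nodeConfigGroup: default_group
  - id: 2
    nodeConfigGroup: default_group
  - id: 3
    nodeConfigGroup: default_group
  - id: 4
    nodeConfigGroup: default_group
  - id: 5
    nodeConfigGroup: default_group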

What did you expect to see?
3 new nodes to be simultaneously created and joined in the cluster

What did you see instead? Under which circumstances?
3 new nodes were created simultaneously and joined the cluster, but after that they were re-created one by one, and only then was the cluster fully functional. This leads to a linear increase in the time needed to scale the cluster up: if adding one node takes 5 minutes, adding 2 nodes takes ~10 minutes, and so on. Is this the expected behavior, or is it an issue with our configuration/environment?

Environment

  • nifikop version:

    v0.6.0

  • Kubernetes version information:

    Server Version: version.Info{Major:"1", Minor:"17+", GitVersion:"v1.17.17-eks-087e67", GitCommit:"087e67e479962798594218dc6d99923f410c145e", GitTreeState:"clean", BuildDate:"2021-07-31T01:39:55Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

  • Kubernetes cluster kind:
    EKS

  • NiFi version:
    1.12.1

Additional context
Nifi cluster config

apiVersion: nifi.orange.com/v1alpha1
kind: NifiCluster
metadata:
  name: <name>
  namespace: <namespace>
spec:
  clusterImage: <image> # Nifi 1.12.1 image
  externalServices:
  - name: clusterip
    spec:
      portConfigs:
      - internalListenerName: http
        port: 8080
      type: ClusterIP
  initContainerImage: <busybox image>
  listenersConfig:
    internalListeners:
    - containerPort: 8080
      name: http
      type: http
    - containerPort: 6007
      name: cluster
      type: cluster
    - containerPort: 10000
      name: s2s
      type: s2s
    - containerPort: 9090
      name: prometheus
      type: prometheus
  nifiClusterTaskSpec:
    retryDurationMinutes: 10
  nodeConfigGroups:
    default_group:
      isNode: true
      resourcesRequirements:
        limits:
          cpu: "2"
          memory: 6Gi
        requests:
          cpu: "2"
          memory: 6Gi
      serviceAccountName: default
      storageConfigs:
      - mountPath: /opt/nifi/data
        name: data
        pvcSpec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 30Gi
          storageClassName: general
      - mountPath: /opt/nifi/content_repository
        name: content-repository
        pvcSpec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 2Gi
          storageClassName: general
      - mountPath: /opt/nifi/flowfile_repository
        name: flowfile-repository
        pvcSpec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 2Gi
          storageClassName: general
      - mountPath: /opt/nifi/provenance_repository
        name: provenance-repository
        pvcSpec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 2Gi
          storageClassName: general
      - mountPath: /opt/nifi/nifi-current/work
        name: work
        pvcSpec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 5Gi
          storageClassName: general
  nodes:
  - id: 0
    nodeConfigGroup: default_group
  - id: 1
    nodeConfigGroup: default_group
  - id: 2
    nodeConfigGroup: default_group
  oneNifiNodePerNode: true
  propagateLabels: true
  readOnlyConfig:
    bootstrapProperties:
      nifiJvmMemory: 2g
      overrideConfigs: |
        java.arg.debug=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000
        conf.dir=./conf
    nifiProperties:
      overrideConfigs: |
        nifi.nar.library.autoload.directory=./extensions
        nifi.web.http.network.interface.default=eth0
        nifi.web.http.network.interface.lo=lo
        nifi.web.proxy.context.path=<proxy_path>
        nifi.database.directory=/opt/nifi/data/database_repository
        nifi.flow.configuration.archive.dir=/opt/nifi/data/archive
        nifi.flow.configuration.file=/opt/nifi/data/flow.xml.gz
        nifi.templates.directory=/opt/nifi/data/templates
        nifi.provenance.repository.max.storage.size=2GB
        nifi.provenance.repository.indexed.attributes=te$containerId,te$id
      webProxyHosts:
      - <proxy_host>
    zookeeperProperties: {}
  service:
    headlessEnabled: true
  zkAddress: <zk_addr>
  zkPath: <zk_path>
erdrix self-assigned this Oct 13, 2021
erdrix added the community and help wanted labels Oct 13, 2021
@iordaniordanov
Author

Hello, any info here?

@erdrix
Contributor

erdrix commented Nov 12, 2021

Hello, yes, this is the expected behaviour. We are forced to do it this way because, if all of the initial cluster nodes were down, a newly joining node could decide that it is the reference, in which case all information on the other nodes would be erased once they rejoin ...
So we have an init script specific to new nodes, and once a node has explicitly joined the cluster, we need to restart its pod with a "non-joining" script: https://github.com/Orange-OpenSource/nifikop/blob/master/pkg/resources/nifi/pod.go#L392
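
To illustrate the idea (this is not the actual code behind the pod.go link above; the function and script names are made up): the operator effectively picks a different startup command depending on whether the node still has to join the cluster, which is why each new node boots once with the joining logic and is then recreated without it.

// Illustrative sketch only -- nifikop's real logic is in pkg/resources/nifi/pod.go.
// The script paths below are hypothetical.
package main

import "fmt"

// startupCommand returns the command a NiFi pod would run.
// A node that has never joined the cluster starts with a "joining" script so it
// fetches the existing flow from the cluster instead of electing itself as the
// reference; once it has joined, the operator recreates the pod with the plain
// start script, which is what causes the one-by-one restarts seen during scale-up.
func startupCommand(hasJoinedCluster bool) []string {
	if !hasJoinedCluster {
		return []string{"/opt/nifi/scripts/joining_start.sh"}
	}
	return []string{"/opt/nifi/scripts/default_start.sh"}
}

func main() {
	// First boot of a brand-new node vs. any later boot.
	fmt.Println(startupCommand(false)) // joining start
	fmt.Println(startupCommand(true))  // plain start
}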

@iordaniordanov
Author

Okay, thanks for the clarification :)

@iordaniordanov
Author

I'm sure you thought it through, but just a suggestion: maybe before scaling up you could check whether the cluster reports itself as healthy and, if it does not, abort the scale operation. Otherwise, if someone wants to add, say, 50 nodes because of a spike in usage, they would have to wait multiple hours before all of the nodes successfully join the cluster ...
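
For illustration, such a pre-flight check could be as simple as asking NiFi how many nodes are currently connected before applying the new node count. A minimal sketch, external to nifikop, against NiFi's REST API /nifi-api/controller/cluster endpoint, assuming an unsecured HTTP cluster like the one in the config above (field names follow the ClusterEntity response and are worth verifying against your NiFi version):

// Illustrative pre-flight check, not part of nifikop: count CONNECTED nodes
// via NiFi's REST API before deciding to scale up.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

type clusterEntity struct {
	Cluster struct {
		Nodes []struct {
			Status string `json:"status"`
		} `json:"nodes"`
	} `json:"cluster"`
}

// connectedNodes calls GET <baseURL>/nifi-api/controller/cluster and returns
// how many cluster nodes report the CONNECTED status.
func connectedNodes(baseURL string) (int, error) {
	resp, err := http.Get(baseURL + "/nifi-api/controller/cluster")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	var ce clusterEntity
	if err := json.NewDecoder(resp.Body).Decode(&ce); err != nil {
		return 0, err
	}
	connected := 0
	for _, node := range ce.Cluster.Nodes {
		if node.Status == "CONNECTED" {
			connected++
		}
	}
	return connected, nil
}

func main() {
	// Placeholder URL: point this at the cluster's HTTP service (port 8080 above).
	n, err := connectedNodes("http://<nifi-http-service>:8080")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d nodes connected\n", n)
}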
