This repository has been archived by the owner on Oct 20, 2022. It is now read-only.

[BUG] NiFi config cannot be reloaded if the operator crashes and restarts in the middle of reconciliation #130

Open
srteam2020 opened this issue Sep 9, 2021 · 0 comments

Bug Report

What did you do?
We found that the NiFi operator can fail to apply a config update specified by the user if it crashes and restarts at a particular point during reconciliation.

More concretely, the user can update the NiFi config by patching or updating the NifiCluster CR (e.g. setting nifiProperties to nifi.ui.banner.text=xxx). Ideally, the NiFi operator then writes the config change to the ConfigMap and later restarts the affected NiFi pod. The detailed steps are:

  1. The operator checks whether the config in the CR spec has changed. If it has:
  2. The operator updates the ConfigMap data to make it consistent with the CR spec.
  3. The operator sets ConfigOutofSync in the CR status.
  4. The operator reads the CR status and checks whether ConfigOutofSync is set. If so, it deletes the pod (NiFi node) and restarts it in the next round of reconciliation.
  5. When the pod restarts, the operator sets the CR status to ConfigInSync.

However, we found that if the operator crashes between step 2 and step 3 and then restarts, the NiFi pod is never restarted and the new config never reaches NiFi. The reason is that after step 2 the ConfigMap is already consistent with the CR spec, but the CR status has not yet been set to ConfigOutofSync. After the restart, the operator detects no change in step 1 and therefore never sets ConfigOutofSync again.
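
The failure can be reproduced with a small in-memory model of the five steps. This is only a sketch: the `cluster` struct, the `reconcile` function, and the `crashAfterConfigMap` flag are hypothetical stand-ins and not nifikop code, and steps 4 and 5 are collapsed into a single pass for brevity.

```go
package main

import "fmt"

// cluster is a hypothetical in-memory stand-in for the objects the
// operator touches; none of these names come from nifikop.
type cluster struct {
	specConfig string // desired config from the NifiCluster CR spec
	configMap  string // data currently stored in the ConfigMap
	outOfSync  bool   // true while the CR status is ConfigOutofSync
	podConfig  string // config the running NiFi pod was started with
}

// reconcile models one pass over steps 1-5. When crashAfterConfigMap is
// true it returns right after step 2, simulating an operator crash.
func reconcile(c *cluster, crashAfterConfigMap bool) {
	if c.configMap != c.specConfig { // step 1: detect a config change
		c.configMap = c.specConfig // step 2: update the ConfigMap
		if crashAfterConfigMap {
			return // crash before the status is written
		}
		c.outOfSync = true // step 3: mark ConfigOutofSync
	}
	if c.outOfSync { // step 4: restart the pod so it reads the ConfigMap
		c.podConfig = c.configMap
		c.outOfSync = false // step 5: mark ConfigInSync
	}
}

func main() {
	c := &cluster{specConfig: "nifi.ui.banner.text=xxx"}
	reconcile(c, true)  // operator crashes between step 2 and step 3
	reconcile(c, false) // operator restarts and reconciles again
	reconcile(c, false) // further rounds do not help either
	fmt.Println("pod uses new config:", c.podConfig == c.specConfig)
	// prints "pod uses new config: false"
}
```

In this model the ConfigMap matches the spec after the crash, so step 1 never fires again and the pod keeps its old config.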

What did you expect to see?
The NiFi pod starts to use the new config.

What did you see instead? Under which circumstances?
The new config is not reloaded successfully.

Environment

  • nifikop version: 1546e02
  • go version: go version go1.13.9 linux/amd64
  • Kubernetes version information: v1.18.9

Possible Solution
A simple fix would be to swap step 2 and step 3. The operator would then still know the config is out of sync even if it crashes in the middle, and the config would be reloaded after the operator restarts.
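
The reordering can be checked against the same kind of in-memory model. Again, this is a sketch under assumptions: `cluster`, `reconcileFixed`, and the `crashPoint` values are hypothetical names introduced for illustration, not the actual nifikop implementation or the eventual patch.

```go
package main

import "fmt"

// cluster is a hypothetical in-memory stand-in for the operator's state.
type cluster struct {
	specConfig string // desired config from the NifiCluster CR spec
	configMap  string // data currently stored in the ConfigMap
	outOfSync  bool   // true while the CR status is ConfigOutofSync
	podConfig  string // config the running NiFi pod was started with
}

// crashPoint selects where the simulated operator crash happens.
type crashPoint int

const (
	noCrash        crashPoint = iota
	afterStatus               // crash after the status write, before the ConfigMap update
	afterConfigMap            // crash after both writes, before the pod restart
)

// reconcileFixed models one pass with the status write moved ahead of
// the ConfigMap update (step 2 and step 3 swapped).
func reconcileFixed(c *cluster, crash crashPoint) {
	if c.configMap != c.specConfig {
		c.outOfSync = true // mark ConfigOutofSync first
		if crash == afterStatus {
			return
		}
		c.configMap = c.specConfig // then update the ConfigMap
		if crash == afterConfigMap {
			return
		}
	}
	if c.outOfSync { // restart the pod and mark ConfigInSync
		c.podConfig = c.configMap
		c.outOfSync = false
	}
}

func main() {
	for _, cp := range []crashPoint{afterStatus, afterConfigMap} {
		c := &cluster{specConfig: "nifi.ui.banner.text=xxx"}
		reconcileFixed(c, cp)      // operator crashes mid-reconcile
		reconcileFixed(c, noCrash) // operator restarts and reconciles again
		fmt.Println("pod uses new config:", c.podConfig == c.specConfig)
	}
}
```

With the status written first, either the ConfigMap still differs from the spec after the crash (so the change is detected again) or the status already records ConfigOutofSync, so the pod restart is not lost in either case.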

Additional context
We are willing to send a PR to help fix this bug.

srteam2020 changed the title from "[BUG] NiFi config cannot be reloaded if the operator crashes in the middle of reconciliation" to "[BUG] NiFi config cannot be reloaded if the operator crashes and restarts in the middle of reconciliation" on Sep 9, 2021
erdrix added this to the 0.7.1 milestone on Sep 17, 2021
erdrix modified the milestones: 0.7.1, 0.8.0 on Nov 12, 2021