Problem with old downtimes #42

Open
duylong opened this issue May 11, 2020 · 6 comments

Comments

duylong commented May 11, 2020

Hi,

I have a problem with "Upcoming downtimes". My old downtimes are still listed with the status "Downtime currently running" and a "Cancel" button.

(screenshot: upcoming-downtimes)

Do you know why?

Author

duylong commented May 12, 2020

It seems that when the worker is down, it does not detect the end of a downtime.
Doesn't the worker restart automatically when something goes wrong?

statusengine-worker[13]: Elasticsearch error!
statusengine-worker[13]: Elasticsearch error!
statusengine-worker[13]: No alive nodes found in your cluster
statusengine-worker[13]: No alive nodes found in your cluster
...
statusengine-worker[13]: Elasticsearch error!
statusengine-worker[13]: Elasticsearch error!
statusengine-worker[13]: No alive nodes found in your cluster
statusengine-worker[13]: No alive nodes found in your cluster

There are no more errors after restarting the service, but the obsolete downtimes are still there.

Even if it's a worker problem, in any case the interface should still not display expired downtimes.
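
A minimal Python sketch of that idea, for illustration only: the interface could hide any downtime whose scheduled end has already passed, whether or not the worker recorded that it stopped. The `Downtime` record, its field names and `visible_downtimes` are hypothetical and are not Statusengine's actual schema or code.

```python
import time
from dataclasses import dataclass


@dataclass
class Downtime:
    # Hypothetical record layout, not Statusengine's real table structure.
    hostname: str
    scheduled_start_time: int  # Unix timestamp
    scheduled_end_time: int    # Unix timestamp
    was_cancelled: bool = False


def visible_downtimes(downtimes, now=None):
    """Return only downtimes that are still running or upcoming at `now`."""
    if now is None:
        now = int(time.time())
    return [
        d for d in downtimes
        if not d.was_cancelled and d.scheduled_end_time > now
    ]
```

With a filter like this, a downtime that ended while the worker was down would drop out of the list even though no stop event was ever stored for it.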

Member

nook24 commented May 19, 2020

This sounds interesting. Basically this is a worker-related issue. However, the worker forks a MiscChild, which handles Notifications, Downtimes and Acknowledgements, and a separate PerfdataChild, which processes performance data.

Normally, whatever happens to the perfdata child should not cause any side effects in other processes / workers.
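
The isolation described here can be illustrated with a small Python sketch (Unix only). It is not the Statusengine worker, which is written in PHP; `misc_child`, `perfdata_child`, `fork_child` and `run_children` are made-up names. A parent forks two children, one of them fails the way a perfdata backend might, and the other still exits cleanly.

```python
import os


def misc_child() -> None:
    # Stand-in for handling notifications, downtimes and acknowledgements.
    pass


def perfdata_child() -> None:
    # Stand-in for a perfdata backend failure.
    raise ConnectionError("No alive nodes found in your cluster")


def fork_child(task) -> int:
    """Fork a child that runs `task`; return the child's PID to the parent."""
    pid = os.fork()
    if pid == 0:
        status = 1  # assume failure until the task completes
        try:
            task()
            status = 0
        finally:
            # Leave the child immediately, without running parent cleanup.
            os._exit(status)
    return pid


def run_children() -> dict:
    """Fork both children and collect their exit codes by name."""
    pids = {
        "misc": fork_child(misc_child),
        "perfdata": fork_child(perfdata_child),
    }
    codes = {}
    for name, pid in pids.items():
        _, status = os.waitpid(pid, 0)
        codes[name] = os.WEXITSTATUS(status)
    return codes
```

Because each child is its own process, the exception in `perfdata_child` only changes that child's exit code.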

> In any case the interface should still not display expired downtimes.

I will do some investigation into this.

Author

duylong commented May 20, 2020

Yes, I noticed that the worker errors had no side effects. The problem is still present, but I can't find its source in order to reproduce it. For now I clean up the MySQL database manually, which is not a very clean solution.

Member

nook24 commented May 4, 2021

Is this still a thing?

Author

duylong commented May 5, 2021

I recently updated to the latest version; I am watching to see whether the problem comes back ;-)

Author

duylong commented May 5, 2021

Shouldn't the command "/opt/statusengine/worker/bin/Console.php cleanup" clean up old ACK / DOWNTIME records?

I still have a DOWNTIME from "23:59 12.13.2020"...

Startusengine Cleanup started at: 2021-05-05 09:23:40
Delete old host records
Delete old host check records... done
Delete old host acknowledgements records... done
Delete old host notification records... done
Delete old host state history records... done
Delete old host downtime history records... done
Delete old service records
Delete old service check records... done
Delete old service acknowledgements records... done
Delete old service notification records... done
Delete old service state history records... done
Delete old service downtime history records... done
Delete old misc records
Delete old log entry records... done
Delete old task records... done
Delete old perfdata records for backend elasticsearch done
Cleanup took: 5 seconds...
Startusengine Cleanup finished at: 2021-05-05 09:23:45
