centreon_plugins.exe can hang forever #28

Open
UrBnW opened this issue Nov 4, 2020 · 3 comments

@UrBnW
Contributor

UrBnW commented Nov 4, 2020

Hi,

centreon_plugins.exe, when called by NSClient++, can hang forever, so processes pile up and memory consumption keeps growing on the monitored Windows machine.

Below is the faulty behaviour we discovered with an unpatched Net::NTP (see #25 and centreon/centreon-plugins#2129).
But it seems to be a more general issue, as I try to demonstrate below.

With the following nsclient.ini configuration:

[/settings/external scripts]
timeout=10

And using centreon_plugins.pl --plugin=apps::protocols::nrpe::plugin --custommode=nsclient --new-api ... as the client (other clients should behave the same way).

When launching the centreon_plugins.pl command, we see 2 new centreon_plugins.exe processes appear:

[Screenshot 1: the 2 new centreon_plugins.exe processes]

10 seconds later, centreon_plugins.pl returns with the following:
Command check_centreon_plugins didn't terminate within the timeout period 10s
Adding --debug gives:
{"command":"check_centreon_plugins","lines":[{"message":"Command check_centreon_plugins didn't terminate within the timeout period 10s","perf":{}}],"result":3}
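
For readers unfamiliar with this reply format, here is a minimal sketch of how the debug output above can be read. It assumes that the numeric "result" field follows the usual Nagios plugin status convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the helper name summarize is hypothetical and is not part of centreon-plugins or NSClient++.

```python
import json

# Assumption: "result" uses the usual Nagios plugin status codes.
STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

# The reply shown above, copied verbatim.
RAW_REPLY = """{"command":"check_centreon_plugins","lines":[{"message":"Command check_centreon_plugins didn't terminate within the timeout period 10s","perf":{}}],"result":3}"""


def summarize(raw):
    """Return (status name, list of messages) for one reply (hypothetical helper)."""
    reply = json.loads(raw)
    # Any code outside the table is reported as UNKNOWN.
    status = STATUS_NAMES.get(reply["result"], "UNKNOWN")
    messages = [line["message"] for line in reply["lines"]]
    return status, messages


status, messages = summarize(RAW_REPLY)
print(status, messages)
```

Under that assumption, the timeout is reported to the client as an UNKNOWN status, which says nothing about whether the underlying process was actually stopped.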

On the Windows side, one of the 2 processes disappears, surely killed by NSClient++:

[Screenshot 2: the remaining centreon_plugins.exe process]

Unfortunately, as you can see, the bigger one remains, and it hangs forever.
So it seems that NSClient++ does its job of killing the external command, and that the issue may rather lie in signal handling / forwarding within centreon_plugins.exe itself.

Even with a dummy sleep in the plugin's code as a really simple test case, the second process does not get killed.

Thx 👍

@garnier-quentin
Contributor

Maybe it's more of a PAR::Packer issue, in fact.

@UrBnW
Contributor Author

UrBnW commented Dec 8, 2020

So, after some investigation, I see one way to solve this issue. Here it is, for reference.

It should be done at the NSClient++ level:
https://github.com/mickem/nscp/blob/0.5.2.41/include/process/execute_process_w32.cpp#L246
Instead of calling TerminateProcess(pi.hProcess, ..., which only acts on the first / parent process, the whole process tree should be terminated.
Some guidelines here: https://stackoverflow.com/questions/1173342/terminate-a-process-tree-c-for-windows
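
To illustrate the idea, here is a minimal, platform-neutral sketch of the tree walk such a fix needs. It is not NSClient++ code. It assumes a process snapshot is available as (pid, parent pid) pairs; on Windows, such pairs could for instance be read from the th32ProcessID and th32ParentProcessID fields returned by CreateToolhelp32Snapshot / Process32First / Process32Next. The function name process_tree and the PIDs used are hypothetical.

```python
from collections import defaultdict, deque


def process_tree(root_pid, snapshot):
    """Return root_pid and all of its descendants, children before parents.

    snapshot is an iterable of (pid, parent_pid) pairs (assumed input format).
    """
    # Index the snapshot by parent so children can be looked up directly.
    children = defaultdict(list)
    for pid, parent_pid in snapshot:
        children[parent_pid].append(pid)

    # Breadth-first walk starting from the process that was launched.
    order = []
    seen = {root_pid}
    queue = deque([root_pid])
    while queue:
        pid = queue.popleft()
        order.append(pid)
        for child in children.get(pid, []):
            if child not in seen:  # guards against loops caused by PID reuse
                seen.add(child)
                queue.append(child)

    # Reverse so that descendants come before the process that spawned them.
    order.reverse()
    return order


# Hypothetical PIDs: 100 = NSClient++, 200 = first centreon_plugins.exe,
# 300 = second centreon_plugins.exe spawned by 200, 400 = unrelated process.
SNAPSHOT = [(100, 4), (200, 100), (300, 200), (400, 4)]
print(process_tree(200, SNAPSHOT))
```

Each PID in the returned list would then be terminated in turn, instead of only the PID of the process that was launched directly. Listing children before parents is a design choice of this sketch; terminating the parent first would be equally possible.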

I was also thinking about a solution at the PAR::Packer level: catching signals (with the signal() function) and forwarding them to the spawned process:
https://github.com/rschupp/PAR-Packer/blob/1.051/myldr/boot.c#L275
But as per the TerminateProcess() definition and documentation, I'm pretty sure handling signals like this won't work here, since TerminateProcess() ends the target process unconditionally and gives it no chance to run a handler.

@UrBnW
Contributor Author

UrBnW commented Jan 4, 2021

Issue opened at the NSClient++ level: mickem/nscp#712
Please upvote it there so that it has a chance of being considered.
