Dev: utils: Check node is reachable by using both ping and ssh #1563

Merged

liangxin1300
Collaborator

If both ping and ssh fail, the node is considered unreachable.

Fixes issue #1551.
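
The rule in the description can be sketched as follows. This is only an illustration of the "unreachable only if both probes fail" logic; the function name `reachable_by_any_probe` and the injected callables `ping` and `port_open` are hypothetical stand-ins, not crmsh APIs.

```python
from typing import Callable


def reachable_by_any_probe(node: str,
                           ping: Callable[[str], bool],
                           port_open: Callable[[str], bool]) -> bool:
    """Return True if either probe succeeds; raise ValueError if both fail."""
    # Short-circuit: the TCP probe runs only when the ping probe fails.
    if ping(node) or port_open(node):
        return True
    raise ValueError(f'host "{node}" is unreachable')
```

Passing the probes in as callables keeps the decision logic testable without touching the network.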

codecov bot commented Sep 25, 2024

Codecov Report

Attention: Patch coverage is 95.83333% with 1 line in your changes missing coverage. Please review.

Project coverage is 69.77%. Comparing base (685cf0a) to head (3988820).
Report is 13 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crmsh/utils.py | 95.00% | 1 Missing ⚠️ |
Additional details and impacted files
| Flag | Coverage Δ |
|---|---|
| integration | 54.58% <95.83%> (+<0.01%) ⬆️ |
| unit | 52.59% <54.16%> (-0.02%) ⬇️ |

| Files with missing lines | Coverage Δ |
|---|---|
| crmsh/bootstrap.py | 88.60% <100.00%> (ø) |
| crmsh/qdevice.py | 98.74% <100.00%> (ø) |
| crmsh/report/core.py | 92.03% <ø> (+0.02%) ⬆️ |
| crmsh/ui_node.py | 43.29% <100.00%> (ø) |
| crmsh/utils.py | 66.70% <95.00%> (+0.08%) ⬆️ |

... and 1 file with indirect coverage changes


@liangxin1300 liangxin1300 marked this pull request as ready for review September 25, 2024 06:10
crmsh/utils.py Outdated

user = userdir.getuser()
rc = check_ssh(user, user, node, timeout=3)
Collaborator

It is overkill to check whether a node is reachable by invoking the ssh command. Opening a TCP socket without sending any data is enough.
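
A minimal sketch of the connect-only probe suggested here, using only the standard library. The helper name `tcp_port_open` and its defaults are assumptions made for illustration; this is not the code that was merged.

```python
import socket


def tcp_port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established.

    No payload is sent: the connection is opened and closed immediately.
    """
    try:
        # create_connection resolves the host and applies the timeout
        # to each connection attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

Compared with spawning `ssh`, this avoids authentication and a child process; it only answers whether something accepts connections on the port.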

crmsh/utils.py Outdated
# ping failed, try to connect to ssh port by socket
error_msg = f"host \"{node}\" is unreachable"
try:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
Collaborator

This does not work with IPv6. Please call getaddrinfo first.

Collaborator Author

I found that we already have utils.check_port_open; it should be reused here.
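
For readers unfamiliar with the IPv6 point raised in this thread, here is a hedged sketch of resolving with `getaddrinfo` before connecting, so the socket family matches whatever the name resolves to. The implementation of `utils.check_port_open` is not shown in this conversation; `connect_any_family` below is a hypothetical standalone helper, not that function.

```python
import socket


def connect_any_family(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Try every address getaddrinfo returns (IPv4 or IPv6) until one connects."""
    try:
        candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return False  # the name could not be resolved at all
    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            # Create the socket with the family of this particular result
            # instead of hardcoding AF_INET.
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            continue  # try the next resolved address
    return False
```

Hardcoding `socket.AF_INET` fails for hosts that are only reachable over IPv6, which is why the family is taken from each `getaddrinfo` result.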

crmsh/utils.py Outdated
-    rc, _, err = ShellUtils().get_stdout_stderr("ping -c 1 {}".format(node))
-    if rc != 0:
-        raise ValueError("host \"{}\" is unreachable: {}".format(node, err))
+    rc, _, _ = ShellUtils().get_stdout_stderr(f"ping -c {ping_count} {node}")
Collaborator

Please add -n to avoid triggering a reverse DNS lookup.
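
As a small illustration of this request, the sketch below builds the ping command with `-n` (numeric output, so addresses in the output are not resolved back to names) and `-c` (packet count). Building an argument list is a choice made for this example; the PR itself formats a shell string for `ShellUtils`, and `build_ping_argv` is a hypothetical name.

```python
from typing import List


def build_ping_argv(node: str, ping_count: int = 1) -> List[str]:
    """Return an argv list for ping with numeric-only output."""
    # -n: do not resolve addresses to hostnames in the output
    # -c: stop after sending ping_count echo requests
    return ["ping", "-n", "-c", str(ping_count), node]
```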

@liangxin1300 liangxin1300 force-pushed the 20240924_improve_ping_node branch 2 times, most recently from 54e60c1 to dc69b4d Compare September 26, 2024 13:25
crmsh/utils.py Outdated
-    rc, _, err = ShellUtils().get_stdout_stderr("ping -c 1 {}".format(node))
-    if rc != 0:
-        raise ValueError("host \"{}\" is unreachable: {}".format(node, err))
+    rc, _, _ = ShellUtils().get_stdout_stderr(f"ping -n -c {ping_count} {node}")
Collaborator

The timeout parameter is not applied to the ping command.
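
One way to honor the timeout, sketched with the standard library only. It assumes the Linux iputils `ping`, where `-W` is the per-reply wait in seconds (other ping implementations interpret this flag differently), and adds a wall-clock limit on the child process as a backstop. `run_ok` and `ping_with_timeout` are hypothetical helpers, not crmsh functions.

```python
import subprocess
from typing import List


def run_ok(argv: List[str], timeout: float) -> bool:
    """Run argv; return True only if it exits with status 0 within timeout seconds."""
    try:
        proc = subprocess.run(argv,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL,
                              timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        # Timed out (the child is killed by subprocess.run) or could not be started.
        return False
    return proc.returncode == 0


def ping_with_timeout(node: str, ping_count: int = 1, timeout: int = 3) -> bool:
    """Ping node, bounding both the per-reply wait and the total run time."""
    # Assumption: iputils ping, where -W takes seconds.
    argv = ["ping", "-n", "-c", str(ping_count), "-W", str(timeout), node]
    return run_ok(argv, timeout=ping_count * timeout + 1)
```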

crmsh/utils.py Outdated
@@ -2460,13 +2467,18 @@ def package_is_installed(pkg, remote_addr=None):
     return rc == 0


-def ping_node(node):
+def node_reachable_checking(node, ping_count=1, port=22, timeout=3):
Collaborator

"check" is a noun.

Suggested change
-def node_reachable_checking(node, ping_count=1, port=22, timeout=3):
+def node_reachable_check(node, ping_count=1, port=22, timeout=3):

@liangxin1300 liangxin1300 merged commit 113b211 into ClusterLabs:master Sep 27, 2024
32 checks passed