ERLANG_NODE set to just a username does not work in recent OTP versions #4288

Open
mzealey opened this issue Oct 2, 2024 · 2 comments

@mzealey
Contributor

mzealey commented Oct 2, 2024

Commit fa12301 added the new -sname undefined way of starting the ejabberdctl module on OTP 23+, but it appears to be causing problems for us.

Running the latest master on OTP 26, the ejabberdctl script cannot connect to the node if ERLANG_NODE=ejabberd, but it works if ERLANG_NODE=ejabberd@ejabberd. This is run in a local container whose hostname is set to ejabberd.

It appears that in the failing case, -eval "net_kernel:connect_node('ejabberd')" is being run and failing silently.

In a minimal reproduction, the following works:

/usr/local/bin/erl -sname undefined -setcookie test-shared-cookie-for-clustering -eval "net_kernel:connect_node('ejabberd@ejabberd')" -s ejabberd_ctl -extra ejabberd
Erlang/OTP 26 [erts-14.2.5.2] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit:ns]

Usage: ejabberdctl [--no-timeout] [--node nodename] [--version api_version] command [arguments]

Available commands in this ejabberd node:
  abort_delete_old_messages host                                      Abort currently running delete old offline messages operation
...

But the following (which is roughly what happens when ERLANG_NODE=ejabberd) fails:

/usr/local/bin/erl -sname undefined -setcookie test-shared-cookie-for-clustering -eval "net_kernel:connect_node('ejabberd')" -s ejabberd_ctl -extra ejabberd
Erlang/OTP 26 [erts-14.2.5.2] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit:ns]

=ERROR REPORT==== 2-Oct-2024::06:25:08.571910 ===

** Cannot get connection id for node ejabberd

Failed RPC connection to the node ejabberd@ejabberd: nodedown

The docs for the file clearly say that this no hostname possibility is allowed, and actually it's very useful to us:

# The next variable allows to explicitly specify erlang node for ejabberd
# It can be given in different formats:
# ERLANG_NODE=ejabberd
#   Lets erlang add hostname to the node (ejabberd uses short name in this case)
@badlop
Member

badlop commented Oct 2, 2024

As described in https://www.erlang.org/doc/system/distributed.html#nodes

A node is an executing Erlang runtime system that has been given a name, using the command-line flag -name (long names) or -sname (short names).
The format of the node name is an atom name@host. name is the name given by the user. host is the full host name if long names are used, or the first part of the host name if short names are used.

A smaller reproduction example:

$ erl -sname e3@localhost
(e3@localhost)1>

$ erl -sname undefined -eval "net_kernel:connect_node('e3@localhost')"
(nonode@nohost)1> q().

$ erl -sname undefined -eval "net_kernel:connect_node('e3@localhost')"
(39JC1CT72Q3PG@atenea)1>
User switch command (type h for help)
 --> r e3@localhost
 --> j
   1  {shell,start,[init]}
   2* {e3@localhost,shell,start,[]}
 --> q
$ erl -sname e3
(e3@atenea)1> node().
e3@atenea

$ erl -sname undefined -eval "net_kernel:connect_node('e3@atenea')"
(nonode@nohost)1>
User switch command (type h for help)
 --> r e3@atenea
 --> j
   1  {shell,start,[init]}
   2* {e3@atenea,shell,start,[]}
 --> q

$ erl -sname undefined -eval "net_kernel:connect_node('e3')"
=ERROR REPORT==== 2-Oct-2024::17:16:20.482158 ===

** Cannot get connection id for node e3

In summary, some OTP interfaces accept only the user part of the node name (for example the -name and -sname command-line arguments), but others require the full node name (for example the shell r command and the net_kernel:connect_node function).

> no hostname possibility is allowed, and actually it's very useful to us:

As the ejabberdctl.cfg documentation, the Erlang documentation, and this experiment all show, an Erlang node name always contains a host part. Some tools accept only the user part and then add the host part themselves.

In other words, even if you configure only ERLANG_NODE=ejabberd, the actual node name is ejabberd@machinename, and that full node name must be provided when calling net_kernel:connect_node.
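To illustrate how a bare user part turns into a full short node name, here is a minimal shell sketch. It assumes the rule quoted from the Erlang documentation (with short names, the host is the first part of the host name). The helper short_node_name and the example host atenea.example.org are hypothetical and not part of ejabberdctl.

```shell
#!/bin/sh
# Sketch only: short_node_name is a hypothetical helper, not part of ejabberdctl.
# Usage: short_node_name NODE HOSTNAME
# Prints NODE unchanged if it already has a host part; otherwise appends
# the first label of HOSTNAME, as short names do.
short_node_name() {
    node=$1
    host=$2
    case "$node" in
        *@*) printf '%s\n' "$node" ;;
        *)   printf '%s@%s\n' "$node" "${host%%.*}" ;;
    esac
}

short_node_name ejabberd atenea.example.org
short_node_name ejabberd@ejabberd atenea.example.org
```

The second call shows that a name which already carries a host part is passed through untouched.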

The obvious solution would be for ejabberdctl to check whether ERLANG_NODE contains only the user part and, in that case, add the host part, so that all use cases work correctly.

> no hostname possibility is allowed, and actually it's very useful to us

Is it useful because it lets you use the same configuration file on several machines with different machine names? In that case, the obvious solution should work for you too, right?

Example patch:

diff --git a/ejabberdctl.template b/ejabberdctl.template
index 83ec7e1bd..21be6430f 100755
--- a/ejabberdctl.template
+++ b/ejabberdctl.template
@@ -66,6 +66,7 @@ done
 # shellcheck source=ejabberdctl.cfg.example
 [ -f "$EJABBERDCTL_CONFIG_PATH" ] && . "$EJABBERDCTL_CONFIG_PATH"
 [ -n "$ERLANG_NODE_ARG" ] && ERLANG_NODE="$ERLANG_NODE_ARG"
+[ "$ERLANG_NODE" = "${ERLANG_NODE%@*}" ] && ERLANG_NODE="$ERLANG_NODE@$(hostname -s)"
 [ "$ERLANG_NODE" = "${ERLANG_NODE%.*}" ] && S="-s"
 : "${SPOOL_DIR:="{{spool_dir}}"}"
 : "${EJABBERD_LOG_PATH:="$LOGS_DIR/ejabberd.log"}"

@mzealey
Contributor Author

mzealey commented Oct 2, 2024

Yes, exactly. This sounds like it would work. However, I don't know the ejabberdctl script in much depth, so I'm not sure whether modifying $ERLANG_NODE this way would cause other issues later in the script (i.e. whether the change should be scoped to a new variable used only for net_kernel:connect_node) or whether it is OK to do globally.
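
For comparison, the scoped alternative could look roughly like the sketch below. CONNECT_NODE and qualify_node are hypothetical names, not taken from ejabberdctl; the idea is only that ERLANG_NODE stays as configured while the qualified name is kept in a separate variable for the connect call. The host is passed explicitly here so the result is predictable; without the second argument the helper falls back to hostname -s, as the patch does.

```shell
#!/bin/sh
# Sketch only: qualify_node and CONNECT_NODE are hypothetical names.
# Usage: qualify_node NODE [HOST]
# Prints NODE with a host part; HOST defaults to the output of `hostname -s`.
qualify_node() {
    node=$1
    host=${2:-$(hostname -s)}
    case "$node" in
        *@*) printf '%s\n' "$node" ;;
        *)   printf '%s@%s\n' "$node" "$host" ;;
    esac
}

ERLANG_NODE=ejabberd
# ERLANG_NODE is left untouched; only the connect target is qualified.
CONNECT_NODE=$(qualify_node "$ERLANG_NODE" ejabberd)
echo "ERLANG_NODE=$ERLANG_NODE CONNECT_NODE=$CONNECT_NODE"
```

Whether the extra variable is worth it depends on how the rest of the script uses ERLANG_NODE, which this sketch does not try to answer.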

@badlop badlop self-assigned this Oct 7, 2024
@badlop badlop added this to the ejabberd 24.10 milestone Oct 7, 2024