After restarting XMPP server, Jigasi gets stuck "reconnecting" but does not actually reconnect #331

Open
jbg opened this issue Dec 19, 2020 · 3 comments


jbg commented Dec 19, 2020

Description

After the XMPP server that Jigasi is connected to for call control is restarted, Jigasi sometimes gets into a loop where it claims that it is reconnecting, but does not actually attempt to reconnect. Restarting Jigasi always solves the problem immediately. During this time, health checks (/about/health on the REST API) continue to succeed, because they only consider whether the SIP and/or transcription services are "ready", and do not look at the XMPP connections at all.

Current behavior

Jigasi fails to reconnect to the XMPP server. The following log sequence repeats continuously until Jigasi is manually restarted; the backoff delay increases at first and then stays at 300s. As soon as Jigasi is restarted, it connects successfully and returns to normal operation.

INFO: [28] plugin.reconnectplugin.PPReconnectWrapper.scheduleReconnectIfNeeded().425 Reconnect ProtocolProviderServiceJabberImpl(Jabber:jigasi@services.[redacted].net) after 300000 ms.
INFO: [28] plugin.reconnectplugin.PPReconnectWrapper.run().452 Start reconnecting ProtocolProviderServiceJabberImpl(Jabber:jigasi@services.[redacted].net)
SEVERE: [28] impl.protocol.jabber.ProtocolProviderServiceJabberImpl.initializeConnectAndLogin().779 No server addresses found
INFO: [28] plugin.reconnectplugin.PPReconnectWrapper.cancelReconnect().273 Cancel reconnect ReconnectTask [delay=300000, provider=ProtocolProviderServiceJabberImpl(Jabber:jigasi@services.[redacted].net)]
INFO: [28] org.jitsi.jigasi.xmpp.CallControlMucActivator.leaveCommonRoom().348 Leaving call control room: [email protected].[redacted].net pps:ProtocolProviderServiceJabberImpl(Jabber:jigasi@services.[redacted].net)
SEVERE: [28] org.jitsi.jigasi.xmpp.CallControlMucActivator.leaveCommonRoom().362 net.java.sip.communicator.service.protocol.OperationFailedException: Provider not connected to jabber server
  net.java.sip.communicator.service.protocol.OperationFailedException: Provider not connected to jabber server
    at net.java.sip.communicator.impl.protocol.jabber.OperationSetMultiUserChatJabberImpl.assertSupportedAndConnected(OperationSetMultiUserChatJabberImpl.java:566)
    at net.java.sip.communicator.impl.protocol.jabber.OperationSetMultiUserChatJabberImpl.findRoom(OperationSetMultiUserChatJabberImpl.java:315)
    at org.jitsi.jigasi.xmpp.CallControlMucActivator.leaveCommonRoom(CallControlMucActivator.java:354)
    at org.jitsi.jigasi.xmpp.CallControlMucActivator.registrationStateChanged(CallControlMucActivator.java:271)
    at net.java.sip.communicator.service.protocol.AbstractProtocolProviderService.fireRegistrationStateChanged(AbstractProtocolProviderService.java:187)
    at net.java.sip.communicator.impl.protocol.jabber.ProtocolProviderServiceJabberImpl.unregisterInternal(ProtocolProviderServiceJabberImpl.java:1557)
    at net.java.sip.communicator.impl.protocol.jabber.ProtocolProviderServiceJabberImpl.unregisterInternal(ProtocolProviderServiceJabberImpl.java:1545)
    at net.java.sip.communicator.impl.protocol.jabber.ProtocolProviderServiceJabberImpl.unregister(ProtocolProviderServiceJabberImpl.java:1527)
    at net.java.sip.communicator.plugin.reconnectplugin.PPReconnectWrapper.unregister(PPReconnectWrapper.java:367)
    at net.java.sip.communicator.plugin.reconnectplugin.PPReconnectWrapper.reconnect(PPReconnectWrapper.java:350)
    at net.java.sip.communicator.plugin.reconnectplugin.PPReconnectWrapper.registrationStateChanged(PPReconnectWrapper.java:221)
    at net.java.sip.communicator.service.protocol.AbstractProtocolProviderService.fireRegistrationStateChanged(AbstractProtocolProviderService.java:187)
    at net.java.sip.communicator.service.protocol.AbstractProtocolProviderService.fireRegistrationStateChanged(AbstractProtocolProviderService.java:141)
    at net.java.sip.communicator.impl.protocol.jabber.ProtocolProviderServiceJabberImpl.initializeConnectAndLogin(ProtocolProviderServiceJabberImpl.java:783)
    at net.java.sip.communicator.impl.protocol.jabber.ProtocolProviderServiceJabberImpl.register(ProtocolProviderServiceJabberImpl.java:500)
    at net.java.sip.communicator.plugin.reconnectplugin.PPReconnectWrapper$ReconnectTask.run(PPReconnectWrapper.java:454)
    at java.base/java.util.TimerThread.mainLoop(Timer.java:556)
    at java.base/java.util.TimerThread.run(Timer.java:506)
INFO: [28] plugin.reconnectplugin.PPReconnectWrapper.scheduleReconnectIfNeeded().425 Reconnect ProtocolProviderServiceJabberImpl(Jabber:jigasi@services.[redacted].net) after 300000 ms.

Expected Behavior

Jigasi reconnects to the XMPP server once it is reachable again, and for as long as it cannot connect, the health check reports unhealthy.

Possible Solution

Find and resolve the issue with reconnection, and additionally make the health checks consider the XMPP connection state as well as the SIP/transcription gateway "readiness".
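
To illustrate the health check half of this suggestion, here is a minimal sketch of a check that combines the existing gateway readiness with the state of the XMPP connections. It is not Jigasi's actual implementation: the class name `AggregateHealthCheck` and its suppliers are hypothetical, and how a supplier would be wired to a real protocol provider is left open.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

/**
 * Hypothetical aggregate health check. None of these names come from
 * Jigasi; they only illustrate combining two kinds of readiness.
 */
final class AggregateHealthCheck {
    // Stands in for the existing SIP/transcription "ready" check.
    private final BooleanSupplier gatewayReady;
    // One supplier per call control XMPP connection (assumed wiring).
    private final List<BooleanSupplier> xmppConnected;

    AggregateHealthCheck(BooleanSupplier gatewayReady,
                         List<BooleanSupplier> xmppConnected) {
        this.gatewayReady = gatewayReady;
        this.xmppConnected = List.copyOf(xmppConnected);
    }

    /** Healthy only if the gateway is ready and every XMPP connection is up. */
    boolean isHealthy() {
        if (!gatewayReady.getAsBoolean()) {
            return false;
        }
        for (BooleanSupplier connection : xmppConnected) {
            if (!connection.getAsBoolean()) {
                return false;
            }
        }
        return true;
    }
}
```

Whether a single disconnected XMPP connection should fail the whole check, or only the loss of all of them, is a design choice; this sketch takes the stricter option so that an orchestrator would restart the instance in the situation described above.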

Steps to reproduce

  1. Start Jigasi with a call control XMPP connection
  2. Restart the XMPP server

Environment details

  • Latest Jigasi git master, built from source
  • Prosody hg master, built from source
  • Running in a container environment with service discovery, so while Prosody is restarting, its DNS name does not resolve. This may be relevant, as "No server addresses found" appears in the logs above. However, once Prosody starts again, its DNS name does resolve. I have confirmed this by manually looking up the name from inside the Jigasi container: it resolves successfully there even while Jigasi is failing to connect and logging "No server addresses found". Perhaps Jigasi is caching the failed DNS lookup from the period when Prosody was down?
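
One way to start checking the DNS caching hypothesis is to look at the JVM's own lookup cache settings. The sketch below reads and sets the standard `networkaddress.cache.ttl` and `networkaddress.cache.negative.ttl` security properties. The class name `DnsCacheSettings` is hypothetical, and it is an assumption that the failing lookup goes through `InetAddress` at all: if the XMPP stack uses a separate resolver library with its own cache, these properties would have no effect on it.

```java
import java.security.Security;

/** Hypothetical helper for inspecting the JVM-level DNS cache settings. */
final class DnsCacheSettings {
    static final String POSITIVE_TTL = "networkaddress.cache.ttl";
    static final String NEGATIVE_TTL = "networkaddress.cache.negative.ttl";

    /**
     * Sets how long failed lookups are cached, in seconds.
     * 0 disables negative caching; -1 caches failures forever.
     * The JDK reads this once when its cache policy is initialized,
     * so it should be set before the first name lookup.
     */
    static void setNegativeTtlSeconds(int seconds) {
        Security.setProperty(NEGATIVE_TTL, Integer.toString(seconds));
    }

    /** Returns the configured values; null means the property is unset. */
    static String describe() {
        return POSITIVE_TTL + "=" + Security.getProperty(POSITIVE_TTL)
            + ", " + NEGATIVE_TTL + "=" + Security.getProperty(NEGATIVE_TTL);
    }

    public static void main(String[] args) {
        setNegativeTtlSeconds(0);
        System.out.println(describe());
    }
}
```

The stock JDK configuration caches failed lookups for only a short time, so an unmodified JVM-level cache alone seems unlikely to explain failures that persist across 300s retries. This is a first thing to rule out rather than a confirmed cause.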
@timowevel1

Same problem here

bgrozev (Member) commented Nov 30, 2023

Updating jicoco to include jitsi/jicoco#192 might help.

@damencho (Member)

@bgrozev it is jitsi-desktop that takes care of reconnects. It reconnects the protocol providers (XMPP, SIP, and so on).
