You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
Using a scripted two-site demonstration setup (script included below), the sites initialize but the router pods go into crash-loop-backoff and restart twice before the network finally stablizes.
How To Reproduce
On a cluster, create namespaces demo-dmz-a and demo-dmz-b and run the provided setup script. Watch the pods and sites to observe crashes prior to network stabilization.
Expected behavior
I expect the network to stabilize in an orderly fashion without seeing crash indications.
Environment details
Skupper CLI: None used
Skupper Operator (if applicable): head of the v2 branch
Platform: openshift
Additional context
The error seen in the router log prior to crashing:
2024-10-11 16:02:24.121816 +0000 ROUTER (critical) Router start-up failed: Python: CError: Configuration: Failed to configure TLS caCertFile '/etc/skupper-router-certs/skupper-site-server/ca.crt' from sslProfile 'skupper-site-server'
The router has a new behavior post-3.0.0 (I was running the latest in this test). SslProfiles now load the referenced certificate files immediately upon configuration. The old behavior was to load the certificates upon connection-startup for every new connection.
This means that before an sslProfile is created, all of its referenced files must already exist in the file system.
In Skupper, the config-sync module stores the current configuration, including slProfiles, in the router's config-map. When the router starts up, it will read the configuration mounted from that config-map as the initial configuration. The certificate files, however, are not mounted into the router container. They are copied at run-time into a shared file system by the config-sync container.
This means that there is a race condition at pod-startup. If the router reads its initial configuration before config-sync can store the certificate files, the router will shut down due to the incomplete configuration.
Describe the bug
Using a scripted two-site demonstration setup (script included below), the sites initialize but the router pods go into crash-loop-backoff and restart twice before the network finally stablizes.
How To Reproduce
On a cluster, create namespaces demo-dmz-a and demo-dmz-b and run the provided setup script. Watch the pods and sites to observe crashes prior to network stabilization.
Expected behavior
I expect the network to stabilize in an orderly fashion without seeing crash indications.
Environment details
Additional context
The error seen in the router log prior to crashing:
The script used to reproduce:
The text was updated successfully, but these errors were encountered: