Robot fails to boot after any issues with RMW config override XML for Cyclonedds #455

Open
lukeopteran opened this issue Nov 14, 2023 · 8 comments
Labels
documentation Improvements or additions to documentation

Comments

@lukeopteran

lukeopteran commented Nov 14, 2023

The robot is unable to fully boot (the LED keeps spinning with no chime) when there is an issue in the XML config. It would be best if it could handle the exception and write an error to the log output.

Examples include setting a non-existent network device ID or using the wrong network interface parameter for Cyclone DDS, e.g. using the Humble format on Galactic robot firmware.

Humble XML

    <General>
      <Interfaces>
        <NetworkInterface name="usb0"/>
      </Interfaces>
      <AllowMulticast>true</AllowMulticast>
      <MaxMessageSize>65500B</MaxMessageSize>
      <DontRoute>true</DontRoute>
    </General>

Rather than

    <General>
      <Interfaces>
        <NetworkInterface name="usb0" />
      </Interfaces>
      <AllowMulticast>true</AllowMulticast>
      <DontRoute>true</DontRoute>
    </General>
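Since mixing up the two formats is easy, a quick check before uploading might help. The following is only a rough sketch: it reports which interface-selection style a fragment uses. The `<Interfaces>/<NetworkInterface>` style is the one shown in this issue; the `<NetworkInterfaceAddress>` element is an assumption based on older Cyclone DDS releases and is not confirmed anywhere in this thread. The function name `interface_style` and its return labels are hypothetical.

```python
import xml.etree.ElementTree as ET


def interface_style(xml_text: str) -> str:
    """Classify how a Cyclone DDS config fragment selects its network interface.

    Returns "interfaces-element" for the <Interfaces>/<NetworkInterface> style
    shown in this issue, "network-interface-address" for the (assumed) older
    <NetworkInterfaceAddress> style, or "none" if neither element is present.
    """
    # Raises xml.etree.ElementTree.ParseError if the text is not well-formed.
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so a bare <General> fragment and a full
    # document both work; split("}") drops any XML namespace prefix.
    tags = {elem.tag.split("}")[-1] for elem in root.iter()}
    if "NetworkInterface" in tags:
        return "interfaces-element"
    if "NetworkInterfaceAddress" in tags:
        return "network-interface-address"
    return "none"
```

This only inspects element names. It cannot tell whether `usb0` actually exists on the robot or whether the firmware accepts the file.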

The robot was only recoverable by using the API to factory reset it over Ethernet over USB; no hotspot was available.

Setup
Robot firmware = G5.2
Pi 4 (Ubuntu 20.04) inside the robot, connected on usb0

@shamlian
Collaborator

shamlian commented Nov 14, 2023

You are right -- the robot application does not start when the DDS XML configuration is malformed (forcing the user to use the webserver to fix the robot). We can look into some kind of fallback, but I am not sure that ROS 2 has an easy/native way to do this with a misconfigured middleware. (This is why editing the XML is considered a "beta feature.")

@lukeopteran
Author

@shamlian Thanks, it's quite easy to get the config wrong (especially with the Galactic / Humble changes), so a safe fallback would be a really useful and much appreciated feature!

@alsora
Collaborator

alsora commented Nov 15, 2023

Validating the XML when it is written in the webserver won't be easy. We can validate generic XML syntax, but to do more you need a C++ application that uses the DDS APIs; there isn't even a generic (RMW-agnostic) approach exposed by ROS.

I would also not want to fall back to the default XML if there's a problem.
Users will likely not notice that we are falling back, and will then waste time running with a different XML than the one they intended.

If anything, we should make the cause of the error more prominent and allow fixing it without having to factory reset.
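To illustrate what "generic XML syntax" validation can and cannot catch, here is a minimal sketch using Python's standard library. It is not what the webserver does; the function name `xml_syntax_error` is hypothetical, and the check only covers well-formedness, not DDS semantics.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def xml_syntax_error(xml_text: str) -> Optional[str]:
    """Return a parser error message if xml_text is not well-formed, else None.

    This is a syntax-only check: unknown elements, wrong option names, or a
    non-existent network interface all pass, because the parser knows nothing
    about the DDS configuration schema.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None
```

A mismatched closing tag is reported, but a well-formed file with a made-up option such as `<NoSuchOption>` passes cleanly, which is the gap described above.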

@shamlian
Collaborator

shamlian commented Nov 15, 2023

There is no need to factory reset. The user just has to connect to the webserver and delete (or change) their custom XML profile. (And possibly restart the application, if it's not already crashlooping. 😬 ) Deleting the custom profile will cause the robot to fall back to its default.

@alsora
Collaborator

alsora commented Nov 15, 2023

Ok, that's good.
I was commenting based on @lukeopteran's experience:

The robot was only recoverable by using the API to factory reset it over Ethernet over USB; no hotspot was available.

We could have a page in the docs that describes how to use this feature, what could go wrong, and how to recover.

@shamlian
Collaborator

shamlian commented Nov 15, 2023

That is odd, because the web server was running for the API to be served. It should have been just as possible to instead connect to the web page hosting the XML file and delete it (I could probably toss together a curl command for this). Agree that we could improve the docs.

@shamlian
Collaborator

shamlian commented Nov 15, 2023

The call should be as simple as curl -X POST -d "config=" "http://192.168.186.2/rmw-profile-override-save" to clear the custom XML profile and get the robot back. This should work over any active interface (so if you were configuring over Wi-Fi, use that IP address instead). If the robot's application is not crashlooping, it would also be necessary to restart the application with curl -X POST http://192.168.186.2/api/restart-app
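For anyone scripting the recovery rather than typing the curl commands, a rough Python equivalent could look like the sketch below. The two endpoint paths and the `192.168.186.2` address are taken from the comment above; the function names, the `timeout` value, and the assumption that the robot answers with an ordinary HTTP response are illustrative and untested against real firmware.

```python
import urllib.parse
import urllib.request

ROBOT = "http://192.168.186.2"  # USB Ethernet address quoted in the comment above


def clear_profile_request(base_url: str = ROBOT) -> urllib.request.Request:
    """Build the POST that saves an empty custom RMW profile (config=)."""
    body = urllib.parse.urlencode({"config": ""}).encode("ascii")
    return urllib.request.Request(
        f"{base_url}/rmw-profile-override-save", data=body, method="POST"
    )


def restart_app_request(base_url: str = ROBOT) -> urllib.request.Request:
    """Build the POST that asks the robot to restart its application."""
    return urllib.request.Request(f"{base_url}/api/restart-app", method="POST")


def recover(base_url: str = ROBOT, restart: bool = False, timeout: float = 10.0) -> None:
    """Clear the custom profile, then optionally restart the application."""
    requests = [clear_profile_request(base_url)]
    if restart:
        requests.append(restart_app_request(base_url))
    for req in requests:
        # urlopen raises urllib.error.URLError if the robot is unreachable.
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            print(req.full_url, resp.status)
```

Calling `recover(restart=True)` would send both requests in order; pass a different `base_url` when connected over Wi-Fi.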

@lukeopteran
Author

That's great thanks, will give it a go

@shamlian shamlian added the documentation Improvements or additions to documentation label Dec 5, 2023

3 participants