Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Python API #3801

Closed
wants to merge 111 commits into from
Closed

New Python API #3801

wants to merge 111 commits into from

Conversation

omichel
Copy link
Member

@omichel omichel commented Oct 20, 2021

Fixes #2385.

This PR aims at implementing a new python controller API for Webots replacing the previous one with the following advantages:

  1. Don't depend on swig to generate the python library.
  2. Add python type annotations.
  3. More "pythonic" style: use of python getters and setters.
  4. Use Python constructors to instantiate devices instead of the Robot.get_device() method.
  5. Rely on libController instead of libCppController: simpler to maintain (especially data type conversions), less dependencies (like libstdc++) and more lightweight (no need to load libCppController, no overhead in calling C functions through C++ API).
  6. Use of doc-string.
  7. Various improvements of the API (including the supervisor API for managing nodes and fields).
  8. We could get rid of the various lib/python37, lib/python38, lib/python39 folders and keep only one lib/python folder.
  9. And probably more...

@omichel omichel added the feature Implementation of a major feature label Oct 20, 2021
@omichel omichel added this to the R2022a milestone Oct 20, 2021
@Justin-Fisher
Copy link
Contributor

Justin-Fisher commented Oct 20, 2021

Edit: For anyone arriving here late who doesn't want to scroll down through what has now become a great wall of text, here is a link to my current draft of this New Python API. It's still fairly rough, but is quite usable already, and will keep improving. Bug reports and suggestions welcome!

https://github.com/Justin-Fisher/new_python_api_for_webots


Continuing discussion from #2358.

The proposed export mechanism to make constants available in .dll's looks like it would be useful, and this would provide a nice means for other languages' API's to always get up-to-date values for the constants.

I wasn't sure about the names you export them as though. In the C documentation, these are given names beginning upper-case WB_ so I think that would lead someone who uses a .dll version of the C API to expect that these constants would have names beginning 'WB_', whereas you seem to export them with names that instead start lower-case 'wb_'.

Is there a unified list of all Webots constants somewhere (or a list of all the ones that we'd want to export for use in other languages' API's anyway)? This could be a useful checklist if you're looking to add similar exports to the remaining constants. If not, I have available a list of how all the constants appear in the swig/C++ version, and I have an algorithm that does pretty well at translating those names into the C naming scheme which matches the documented names for the dozen or so that I manually checked, but if any constants have idiosyncratic names this algorithm might not handle them correctly, and without a list to check against I don't have a good way to be sure, short of trying to go look them all up in a bunch of different places, which I'm not eager to do. Even if my list isn't perfect, it might still be a useful checklist.

@omichel
Copy link
Member Author

omichel commented Oct 20, 2021

I wasn't sure about the names you export them as though. In the C documentation, these are given names beginning upper-case WB_ so I think that would lead someone who uses a .dll version of the C API to expect that these constants would have names beginning 'WB_', whereas you seem to export them with names that instead start lower-case 'wb_'.

I believe both versions are needed. the wb_ version is a symbol (similar to API function which starts with wb_) whereas the WB_ version is an enum (similar to a define) defined in the header file. The enum version is still needed because it is more convenient to use in C/C++. For example it is not possible to perform:

  switch(node_type) {
    case wb_NODE_CONE:
       ...

Because wb_NODE_CONE is unknown at compile time, whereas it is possible with WB_NODE_CODE as an enum (or define).

@Justin-Fisher
Copy link
Contributor

Also, is the roundabout import into Python using ROTATIONAL = ctypes.c_int.in_dll(_wb, 'wb_ROTATIONAL').value really necessary? I was under the (perhaps mistaken) impression that for vanilla types like ints, ROTATIONAL = _wb.wb_ROTATIONAL would work just fine.

Incidentally, I've set my version up to have a small handler handle API calls so (at least for now) I could use the swig/C++ names for API methods and have the handler automatically translate them to the corresponding C-name (and cache the results so there's no notable performance hit). This approach would support the fairly attractive notation of using _wb.ROTATIONAL as being shorthand for what I repetitively wrote as _wb.wb_ROTATIONAL above, or you wrote as ctypes.c_int.in_dll(_wb, 'wb_ROTATIONAL').value.

@omichel
Copy link
Member Author

omichel commented Oct 20, 2021

The list of constants is computed by the matlab script and they will appear as files in the matlab folder with uppercase filenames (not on github, but in your local installation).

@Justin-Fisher
Copy link
Contributor

Justin-Fisher commented Oct 20, 2021

I believe both versions are needed. the wb_ version is a symbol (similar to API function which starts with wb_) whereas the WB_ version is an enum (similar to a define) defined in the header file.

If one version will be used only under the hood in the C source code (will it?), and the other version will be exported for other people to use, then I would suggest making the version that gets exported have the name that matches the documentation. E.g., call the enumerated local version ROTATIONAL, and call the exported version WB_ROTATIONAL. Or you could change the documentation to note that these will have wb_ names rather than WB_ ones.

@omichel
Copy link
Member Author

omichel commented Oct 20, 2021

Also, is the roundabout import into Python using ROTATIONAL = ctypes.c_int.in_dll(_wb, 'wb_ROTATIONAL').value really necessary? I was under the (perhaps mistaken) impression that for vanilla types like ints, ROTATIONAL = _wb.wb_ROTATIONAL would work just fine.

I just test it but unfortunately, it doesn't work... We have to rely on the more complex version.

@omichel
Copy link
Member Author

omichel commented Oct 20, 2021

If one version will be used only under the hood in the C source code (will it?), and the other version will be exported for other people to use, then I would suggest making the version that gets exported have the name that matches the documentation. E.g., call the enumerated local version ROTATIONAL, and call the exported version WB_ROTATIONAL. Or you could change the documentation to note that these will have wb_ names rather than WB_ ones.

The wb_ names should never be visible to the users, as users will use the object constant without any prefix, e.g., Motor.LINEAR, not wb_LINEAR. So, the wb_ version doesn't need to be documented in the API. It will be used only in the python module (and possibly in other language interfaces in the future).

@Justin-Fisher
Copy link
Contributor

Justin-Fisher commented Oct 20, 2021

The more explicit type conversion like that we have to do, the more annoying it will be to write the Python controller manually, though I can probably jerry-rig a system that hides these extra steps under a convenient notation.

For some reason, the _controller.pyd version of a dll doesn't have these sorts of annoying type-casting issues. E.g., controller.py's analog to your line is just ROTATIONAL=_controller.Motor_ROTATIONAL. I don't really understand all the ways that '.pyd' differs from .dll. Would there be a prospect for making a '.pyd' version of the C-API to import into our new python controller? Might end up being more convenient if it lets us avoid manually recasting a lot of types. (Or it might be more trouble than it's worth. dunno...)

Edit: Reading up a bit more, it looks like .pyd is just an alternate extension for a .dll that defines some stuff to make it easier to import into python, presumably including some cues that indicate the types of constants. Swig probably automatically added those pieces, but as we move away from swig, we probably wouldn't want to worry about adding them to a .dll. So we'll probably be stuck with a fair amount of explicit type-setting, including for constants.

@omichel
Copy link
Member Author

omichel commented Oct 20, 2021

The more explicit type conversion like that we have to do, the more annoying it will be to write the Python controller manually, though I can probably jerry-rig a system that hides these extra steps under a convenient notation.

This will be annoying only for us, developing the webots python module, not for users, but if you can jerry-rig something to make this smoother, that would be nice.

For some reason, the _controller.pyd version of a dll doesn't have these sorts of annoying type-casting issues. E.g., controller.py's analog to your line is just ROTATIONAL=_controller.Motor_ROTATIONAL. I don't really understand all the ways that '.pyd' differs from .dll. Would there be a prospect for making a '.pyd' version of the C-API to import into our new python controller? Might end up being more convenient if it lets us avoid manually recasting a lot of types. (Or it might be more trouble than it's worth. dunno...)

Yes, but generating a .pyd file is more complicated and platform dependent, probably also python-version dependent. This would kill advantage 8 listed in the first comment and would yield to a more complex build process. I would be happy to avoid that.

@omichel
Copy link
Member Author

omichel commented Oct 21, 2021

I just added a helper function to make it easier for us to retrieve constants.

@Justin-Fisher
Copy link
Contributor

Justin-Fisher commented Oct 21, 2021

So I've extracted all the constants from the Matlab controller, and I've gone through all the constants that were used in the swig-produced controller.py, and worked out systematic ways to translate their names to the corresponding Matlab names which themselves claim to have been straight from the documented names. Unfortunately, I did run into the following exceptions (note, for simplicity, I'm leaving out the WB_ prefix throughout):

swig constant expected translation
'Node_ALTIMETER' 'NODE_ALTIMETER'
'Field_MF_ROTATION' 'MF_ROTATION'
'Supervisor_MOVIE_READY' 'SUPERVISOR_MOVIE_READY'
'Supervisor_MOVIE_ENCODING_ERROR' 'SUPERVISOR_MOVIE_ENCODING_ERROR'
'Supervisor_MOVIE_SAVING' 'SUPERVISOR_MOVIE_SAVING'
'Supervisor_MOVIE_WRITE_ERROR' 'SUPERVISOR_MOVIE_WRITE_ERROR'
'Supervisor_MOVIE_RECORDING' 'SUPERVISOR_MOVIE_RECORDING'
'Supervisor_MOVIE_SIMULATION_ERROR' 'SUPERVISOR_MOVIE_SIMULATION_ERROR'

I think the first two are probably errors of omission in the Matlab controller, since it doesn't seem to define anything like these, so perhaps these should be added there?
I think the remaining ones have to do with the C++/swig implementation of supervisor movie control, which would probably have to be re-implemented in a manually written Python controller anyway? Or maybe these are further errors of omission in the Matlab controller?

So, anyway, I've now caught us up to the Matlab controller regarding constant-importation. Whenever constants get changed, the Matlab controller would have to be manually updated. If we're happy enough having the new python controller be like that too (at least for now), then I've got a copy of the matlab constants ready for us to use, and I can automatically-rewrite the swig controller.py to use these. (Of course, it would be nice eventually to have both Matlab and Python import current values for these constants, so that they wouldn't require manual updating whenever constants change.)

Another quality-assurance check I should probably do, since it'd be pretty easy, is checking whether MATLAB and swig/C++ actually do assign the same values to all the constants!

@omichel
Copy link
Member Author

omichel commented Oct 21, 2021

I believe the problems come from the scoping of these constants. In C and Matlab, no scoping is possible hence all the constants are defined at a global scope. In OO languages, such as C++, Java and Python, the constants are scope with objects, e.g., Supervisor.MOVIE_READY or Field.MF_ROTATION which looks better. As a consequence there is no one-to-one match between the C/Matlab implementation and the C++/Java/Python implementations.

However, there is not a huge number of constants and I believe that a minimum of manual maintenance for them is acceptable, as it would be quite difficult to automate this fully.

@Justin-Fisher
Copy link
Contributor

Justin-Fisher commented Oct 21, 2021

Yes, much of the automated name-translation system I came up with was translating between the Matlab/C naming system that keeps everything at global scope, and the swig naming system which also transfers the constants from C++ to Python using constants with global scope, but uses different naming conventions to reflect which local scope each constant will live at in the C++/Python API's.

The exceptions I listed above aren't just further examples of this. The Matlab API doesn't define anything at all for these constants, not just some different way of naming them. The only Matlab constants that don't have analogs among the swig constants are WB_STDERR and WB_STDOUT.

It looks to me like the Matlab API doesn't offer any way of telling whether a node is an Altimeter by comparing it to a constant like WB_NODE_ALTIMETER. As far as I can tell this is just an oversight in the Matlab controller. This should probably ring alarm bells, since the other node types are all created from an enumerated list, so if that list is missing Altimeter, it may be that everything after that in the enumeration is off by one!!! (The docs do say that this should be an enumerated list including Altimeter, so my bet is that the Matlab constants for almost all of the node types -- everything after altimeter on the alphabetized list -- are just wrong!)

Similarly, the Matlab API doesn't define WB_MF_ROTATION even though it does define the analogous WB_SF_ROTATION. I'm not sure how many MF rotation fields there actually are (how often do you want an arbitrarily long list of rotations?), so this may not be all that big of an oversight, but it seems like it probably still is an oversight. Fortunately these ones are assigned values manually, rather than being enumerated, so this likely won't have thrown off the numbering for anything else.

The various MOVIE constants in the swig API look to me like they're probably a remnant from some deprecated way of checking movie statuses, redundant with and superceded by the currently documented movie functions. Since the Python and C++ methods that use these constants (Supervisor.getMovieStatus and its synonym Supervisor.MovieGetStatus) are no longer documented, I think it'd probably be fine for us to abandon them and the associated constants.

@Justin-Fisher
Copy link
Contributor

Justin-Fisher commented Oct 21, 2021

Ok, I have compared the constant values produced by the swig/C++ controller to the corresponding constant values produced by the Matlab controller.

The good news is that most of these match up perfectly, which confirms that my name-translation system is working well, and that the list of constants I've produced is mostly right.

The bad news is that most of the node-type constants are one smaller in the Matlab version than they are in the swig/C++ version of Webots, which, as I speculated above, probably stems from the fact that Matlab left WB_NODE_ALTIMETER out of its enumeration. So In this case, I suspect that the Swig/C++ constants are correct, and the Matlab Controller should be amended to include WB_NODE_ALTIMETER.

Edit: I submitted a PR #3808 to fix the constants in the Matlab generator.

@omichel
Copy link
Member Author

omichel commented May 13, 2022

Thank you for reporting this bug in a separate issue. We will look into it and hopefully fix it soon.

@Justin-Fisher
Copy link
Contributor

Just a note to report that I've been making good progress, with the supervisor/world near-completely tested (as much as I'm planning to, anyway) and filling out a lot of the ordinary robot devices that I'd been delaying. My real life will have some upheaval due to travel over the next two weeks, but I expect to continue making progress on my laptop.

@Justin-Fisher
Copy link
Contributor

What sorts of data do we want Python's Display.image_new() to be able to take as input, and what sort of format does that data need to have when sent to the C API? It looks like it expects unsigned char's, but the documentation doesn't make clear how many of those there should be. Are these 32-bit colors, with 8-bits each for RGBA (in whatever order the specified format requires), so each channel should be translated into the range 0-255? If someone chooses RGB format (with no A), does that shift to 24-bit format, or should it still be 32, just ignoring every 4th byte?

The Display documentation says "WB_IMAGE_BGRA is the recommended one and the image format returned by a Camera device" but then the sample code it gives immediately after that which is supposed to convey a camera image to a display instead uses ARGB for all languages. Which of these is right? The Camera docs also seem inconsistent on this, saying "The image is coded as a sequence of three bytes representing the red, green and blue levels of a pixel" which implies RGB but also "Internal pixel format of the buffer is BGRA (32 bits). Note that the Java API uses little-endian format and stores the pixel integer value in ARGB format."

I've generally been processing Color objects in the Python controller as RGB, since that's how most other Webots devices handle them, either as RGB 24-bit integers, or as RGB Vec3f's with components ranging 0..1. Will Camera images always be BGRA (or perhaps ARGB, if the things that said that are right)? If so, I'm not sure what steps, if any, I should take to reconcile these incompatible formats? E.g., it seems intuitive that reading a pixel from a Camera should return a Color object that could then be used, e.g., as the color for drawing on a display, or lighting up an LED, but if the Camera data isn't ordered RGB this wouldn't automatically work.

In other, somewhat related news, nightly builds on my laptop are having lots of difficulties with textures, saying that jpeg is an unsupported image format, and not showing Display devices correctly in the 3D world, though showing them fine in the pop-up 2D window. Textures used to work fine on my laptop on the stable build I used to have installed, so I'm not sure if this is a problem with my laptop or with nightly builds? (Reinstalling a nightly build didn't fix it, and I couldn't find anything else in my system path that would have been leading to a wrong version of libjpeg, which I think you had suggested as a diagnosis to a similar problem in discord awhile back.) Anyway, until I get that sorted out on my laptop, or get back home to my desktop, my testing of cameras is somewhat limited.

@stefaniapedrazzi
Copy link
Member

Thank you for pointing out the inconsistencies in the documentation, I just fixed them in #4550.

To summarize the current behavior of the C API:

  • Currently Camera.getImage always returns an image with format BGRA on C and Python (mainly for performance reasons).
  • Display.image_new() should be able to take as input images with any of the listed formats (RGB, RGBA, ARGB, BGRA, ABGR) and pass them to the C API with the corresponding format.
    The expected input data in the corresponding C function is a list of unsigned char where if the format has 4 channels then the size of the list will be width x height x 4 and if there are only three channels the size of the expected size of the input is width x height x 3. So the size of the list depends on the format and no bytes are ignored.

@Justin-Fisher
Copy link
Contributor

In asking what sorts of data we'd like the new python controller's version of Display.image_new() to handle, I meant to be asking what datatypes we should accept on the Python side, e.g. python bytes strings, lists of ints, lists of floats, 2D nested lists of ints (or floats or bytes or Colors), 3D nested lists of ints/floats/bytes, ctypes pointers to camera images, Numpy arrays, etc?

I agree that it'd also be best to be able to somehow handle each of the permutations like RGB and BGRA, though that actually divides into two questions: one about which permutation the data given to the Python controller will have, and another about what permutation it should be passed on to the C-API using. I guess the ideal is to present it to the C-API as BGRA, so that no further internal conversion will have to take place? E.g., if someone passes in data that is a 2D nested list of Color vectors (consisting of floats ranging 0..1, ordered RGBA), then the new python API will need to convert these floats to bytes anyway, so it may as well reorder them to BGRA as it does?

@stefaniapedrazzi
Copy link
Member

Ideally, the formats the new Python API should use and support for images (Camera and Display) should be standard ones that will guarantee straightforward compatibility with common image processing libraries, such as opencv and PIL.
That is also why the Python API should be able to support all the different image formats (RGB, RGBA, BGRA, etc.).
So I would say that it should definitely support at least numpy arrays and nested lists as input/output.

Webots copies the received input image independently from the format, but if the image is in BGRA format the copy method is slightly faster.
However I'm not sure that this advantage of passing BGRA images to the C API justifies additional performance costs in the Python API to convert the image and change the input format.
It could be that the faster and more optimal choice is to pass the image as-is to the C API and let Webots convert it but it should be benchmarked.

From my limited personal use of images in Python, I'm currently not very convinced that representing images in the Python API as a list of Color objects is the best approach to achieve the optimal compatibility and ease-of-use.

@Justin-Fisher
Copy link
Contributor

Justin-Fisher commented May 26, 2022

This has become a wall of text, so I'll boldface the questions and important bits.

At a minimum it seems like we should aim to provide backwards compatibility with whatever data for Display.imageNew the old Python API was supposed to support. The old Python API would apparently accept both some sort(s) of lists and some sort(s) of non-lists, but it and the documentation were completely silent about how many dimensions the list was supposed to have, whether it should contain ints or floats, and about what sorts of non-list arguments it would accept. My guess, which perhaps you can confirm, was that the old API probably was meant to accept 3D lists of ints ranging 0-255, and python bytes strings, and nothing else? At any rate, I'll aim to have the new API accept at least those.

If there are other datatypes that the old API didn't accept that are easy enough to implement, I'm happy to add on. I have now set this up to very efficiently handle Numpy arrays and other objects with Python's "buffer protocol", which should include OpenCV, PIL, and potentially also other buffer-like data types that we haven't even thought of. Any other datatypes you would like?

(There may be some wrinkles involving common cases where people slice/stride their Numpy arrays (e.g. to pick out a rectangular sub-region, to transpose an array, or to sample only those pixels whose x and y values are both even), whereas Webots needs the data to be contiguously packed. But I think I may be able to find a general work-around for those too.)

You're right that C is generally faster than Python, but if we still want to accept lists of non-byte numbers (do we?), I think I'll have no choice but to munge them in Python, to convert them to the byte color components that the C-API requires -- or can the C-API for display_image_new accept other number formats for color-components besides bytes? If so, I'll happily pass off the work to C. If not, I'll be stuck doing the conversion in Python, and then it may not add any performance cost to shift to a different permutation like BGRA at the same time. But it may also be that there aren't enough benefits to be worth the hassle.

You seem to think I'm keen to create 2D lists of Color vectors. I'm not. There are two ways I'd most suggest that users of the new API deal with images from Cameras/RangeFinders/Lidars. They could (1) use the new API's container interface to quickly access data straight from the underlying buffer in the C-API, without any need to copy the whole thing into any sort of Python data structure. For the user, this behaves much like a nested list would, e.g., allowing camera[y][x].green but the underlying data just is the webots buffer accessed via ctypes. (This approach is probably best when you'll pay attention only to some select pixels each timestep), Or I would recommend (2) to use device.array to create a Numpy array that also shares memory with the same underlying buffer with no copying at all, which you can then use for all kinds of fast Numpy-based processing on the whole image. My only reason for (3) providing an option to slowly construct a copy of the image in Python nested lists was for backwards compatibility with the old API. Since Webots did, and presumably still will, provide this as an option, it seems like we should expect that someone might then send such data to Display.image_new, (as the sample program given at the end of the Display docs did) so it seemed to me that we probably should accept lists to display_image_new, if only for backwards-compatibility, no? (I'm also offering (4) an option to copy the camera image to a Python bytes string, for backwards compatibility, though this is usually strictly worse than (1) or (2). And it might make sense (5) to allow some mechanism for copy-pasting straight from a Camera to a Display ImageRef without needing to pass the data itself through Python. I could implement this for any horizontal band spanning the whole camera, but quick-copying narrower selections would probably require changes within Webots, or making extra copies of data.)


And some more questions about display images.

Are images supposed to be tied to particular displays, or could the same image be pasted to multiple displays? If they're not supposed to be tied to particular displays, why do image_new and image_load require a Display tag? And in either case, why do image_save and image_delete require a Display tag? (On the Python side, I ended up opting to make each image remember the display that created it, but not require that you mention that display when doing things that just involve that image, so e.g. you just use image.save(filename) without mentioning its display. For image.paste() I currently have an optional display argument in case you want to allow pasting images to different displays, but by default it will paste itself into the display that created it.)

When a controller stops, is it important that the controller explicitly image_delete all display images, or will Webots take care of that automatically?

Is it fine to image_delete a Display image later in the very same timestep where you create and paste it? This seems to work fine, as far as I can tell, and would remove a needless layer of complexity from the Webots vision how-to sample, which incidentally I now have working nicely in the new Python API, including super-efficiently passing data from Webots to numpy/opencv (no copying needed!) and back to Webots (no copying needed on my end, but I think Webots makes an internal copy of the buffer I send it). Yay! You can check it out here if you like: https://github.com/Justin-Fisher/new_python_api_for_webots

@stefaniapedrazzi
Copy link
Member

Here are some answers

My guess, which perhaps you can confirm, was that the old API probably was meant to accept 3D lists of ints ranging 0-255, and python bytes strings, and nothing else? At any rate, I'll aim to have the new API accept at least those.

Yes, the current Python API accepts two different inputs where the type of each pixel value is a 'char' or integer 0-255.

  1. 3D list with of the form width x height where width and height are automatically extracted from the list
  2. plain arrays of data where the width and height are passed as mandatory arguments

I have now set this up to very efficiently handle Numpy arrays and other objects with Python's "buffer protocol", which should include OpenCV, PIL, and potentially also other buffer-like data types that we haven't even thought of. Any other datatypes you would like?

I think it is fine.

if we still want to accept lists of non-byte numbers (do we?)

I'm not sure why we should accept lists of non-bytes numbers.
I think that for the first version of the API you should aim at a functionality comparable with the previous one.
Additional functionalities should be added later.

can the C-API for display_image_new accept other number formats for color-components besides bytes?

No. Only chars (integer 0-255) are supported.

it seemed to me that we probably should accept lists to display_image_new, if only for backwards-compatibility, no?

It is straightforward to convert lists to Numpy array. We could deprecated the lists as input but still continue to support it for backward compatibility.

And it might make sense (5) to allow some mechanism for copy-pasting straight from a Camera to a Display ImageRef without needing to pass the data itself through Python. I could implement this for any horizontal band spanning the whole camera, but quick-copying narrower selections would probably require changes within Webots, or making extra copies of data.

In the first version of the new Python API I think it could be better to stick to the current functionalities.
Additional functionalities can be added later.

Are images supposed to be tied to particular displays, or could the same image be pasted to multiple displays? If they're not supposed to be tied to particular displays, why do image_new and image_load require a Display tag? And in either case, why do image_save and image_delete require a Display tag?

Images are passed only to the Display device with the corresponding tag.
New images and loaded images are only accessible by a single Display, that is why the tag is needed for saving and deleting.
Note that all the images are stored in the Display instance on the Webots side, so to tell the correct Display to save or delete an image the controller also needs to send the tag.
This is arguable but it is out of the scope of this PR to change the behavior.
It is important that the Python API has the same basic behavior as the C API, personally I don't see any good reason to remove the tag from the Python function inputs. This would just create some confusion.

When a controller stops, is it important that the controller explicitly image_delete all display images, or will Webots take care of that automatically?

Yes, Webots clears all the images when the world is unloaded.

Is it fine to image_delete a Display image later in the very same timestep where you create and paste it?

Yes.

@Justin-Fisher
Copy link
Contributor

What is the preferred way to provide sample controllers in multiple languages for the various included Webots sample worlds?

  1. If we add a python controller alongside a C controller in the same controller folder, Webots will always pick the same one to run, so people won't have any way to run the other.
  2. Should we leave the worlds the same, and just bloat the folder of controllers with new controller versions with their language included in their name? (I have however been slightly altering sample worlds, e.g. to set the controller to a python one, or to give the controller supervisor powers so that in can put informative labels onscreen.)
  3. Or should there just be another project of sample worlds + controllers that are all python samples?

I'm not sure how many of these samples I'll end up making New Python API versions of. Thus far I've just been re-doing the ones that changed more in the new API, like supervisor ones, or ones with high-bandwidth devices like cameras and radar.

@omichel
Copy link
Member Author

omichel commented Jun 1, 2022

What is the preferred way to provide sample controllers in multiple languages for the various included Webots sample worlds?

  1. If we add a python controller alongside a C controller in the same controller folder, Webots will always pick the same one to run, so people won't have any way to run the other.

By default, Webots will run the executable file if any (compiled from a C or C++ source). If not found, it will search for a .class or .jar java binary controller. If not found, it will search for a .py controller. If not found, it will search for a .m MATLAB controller. If not found, it will fail. So if you have both a C and a Python controller in the same controller folder, simply performing a "make clean" from the Webots GUI will clear the executable files if any and reverting the simulation will launch the .py python controller. So, it could be a nice option to have both the C and the Python controller in the same folder and document that users have to clean the C binary in order to be able to launch the Python controller. This could be documented in a local README.md file. The only drawback of this approach is that it will clutter sample controllers with code that is not always used, but having the C and Python code counterpart side-by-side is great from a pedagogical point of view.

  1. Should we leave the worlds the same, and just bloat the folder of controllers with new controller versions with their language included in their name? (I have however been slightly altering sample worlds, e.g. to set the controller to a python one, or to give the controller supervisor powers so that in can put informative labels onscreen.)

This is something we did with the nao_demo_python example which is the counterpart of the nao_demo C controller. However, we did not provide a corresponding world file for the Python version, so users have to change manually the controller from the C one to the Python one, which is not ideal. Making a copy of the world file is also bad because it introduce a lot of redundancy. Finally, having the _python suffix in the controller name is a bit heavy. So, I wouldn't recommend this path.

  1. Or should there just be another project of sample worlds + controllers that are all python samples?

That might be a good idea. We started this inside the python project folder. It would be possible to put more example in there as well. The big advantage of this approach is that python simulations would run out-of-the-box and controller folders wouldn't be cluttered with C code. I see two drawbacks: (a) world files will be duplicated (redundancy and maintenance problem) and (b) the comparison between the C and Python controller is more difficult than in (1).

I'm not sure how many of these samples I'll end up making New Python API versions of. Thus far I've just been re-doing the ones that changed more in the new API, like supervisor ones, or ones with high-bandwidth devices like cameras and radar.

I would recommend to go for approach (1), as it seems to be the best one to me.

Note: we decided a long time ago to give priority to binary code in controllers (instead of Python code) simply because when you install Webots on Windows, there is not necessarily Python installed and hence the Python controllers will be unable to start, giving a bad first impression. We could however reconsider this. To avoid the bad impression effect, we could simply test for the existence of Python and if not present we would not attempt to run the Python controller, but run the binary controller instead. However, this means that every controller (at least in the guided tour) should have at least a binary version to ensure it will always start-up on Windows without Python.

Note: the code for the controller language priority is here.

@Justin-Fisher
Copy link
Contributor

Justin-Fisher commented Jun 2, 2022

For now I'm accumulating finished variants of Webots samples in my work-in-progress version of the new API. I have been preserving copies of the old C / C++ versions alongside, mostly in case I or anyone else wanted to compare. At some point we can decide whether to merge these in with the samples for other languages (which would require adding supervisor powers to some of the sample robots) or just to include them in a separate folder of python samples.

Yeah it makes sense to have the auto-tour not try to run things that won't run on a given computer. It's actually given bad impressions to me and my students the way it is, because (a) the executables that you have it run often trigger anti-virus checks and Webots often crashes waiting for the check to complete, and (b) because one of the early samples is the drone-flying one which, for some reason, is super processor-intensive and runs extremely sluggishly on many computers that can run other simulations just fine. So there's definitely room for improvement in first impressions from what the auto-tour plays.

I noticed in the radar sample that the radar device gives negative azimuth readings for targets in the +y direction. That feels backwards to me, and it is backwards to how rotations around the radar's perpendicular +z axis are usually measured in Webots. Is that the way it's supposed to work? I was tempted to flip over the radar in the sample, which would make it give correct-feeling .azimuth readings with respect to the robot's axes, but of course they'd still be wrong-feeling with respect the the radar's own axes. For now, I've instead just added a .angle property to radar.Target that returns the negation of the target's .azimuth, which makes this more consistent with other uses of angles, including the 4th .a component in the radar's .rotation field (when aligned to look out horizontally along with the robot's xy plane), and the Vector.angle of the target in the frame of reference of the radar's axes.

I don't yet understand Lidar.Point.layer_id. If I understand Lidar right, the PointCloud is another array with the same superficial 2D structure as the basic Lidar range image, except its "pixels" are Lidar.Point structures rather than simple floats. So is point.layer_id just the row of point in this 2D array? It seems odd to have each array element store its own row-index, when that information is already implicitly encoded by the element's position within the array, and when accessing these elements typically required already knowing (or at least being easily capable of computing) what row-index you were accessing. E.g. we don't have Camera pixels have a field indicating what row of the image they're in, so why do PointCloud pixels do this? And if they need an indicator of row, why not also have an indicator of column? (Or maybe that's effectively what the .time attribute does?) This would make more sense if the PointCloud could be sparse and include fewer Points for layers that had some points out of range. But the docs say "(each layer is assumed to have the same number of points associated to)" which seems to imply that the point cloud will be rectangular rather than sparse, and you would expect the wb_layer_get_point_cloud function to have an associated way of reading the number of points in that layer if different layers could have different numbers of points or even be empty. I also haven't been able to get a lidar to claim to have anything less than its full number of points, even when I point it up in the sky. It might have helped if the Lidar sample program did anything more with Lidar than just enable it (and also if it didn't crash Webots twice before finally running)... It also seems like lidar is super-buggy if you ever try to just reset the simulation -- have to hard-reload the world to get it to work most of the time.

@omichel
Copy link
Member Author

omichel commented Jun 3, 2022

Thank you for this feedback. It is indeed very useful. We are currently very busy preparing the new release of Webots R2022b scheduled for the end of June. Hopefully, we will have time to take into account this feedback.

@Justin-Fisher
Copy link
Contributor

Justin-Fisher commented Jun 4, 2022

Okay, it took quite a bit of wrestling but I now have lidar point clouds working, and a much more informative sample lidar for the new python API, showing how to transfer lidar range/cloud images to numpy, how to do a bit of processing on them, e.g. to paste back into webots display images and/or to do simple wayfinding. I can now confirm that the answer to my above question is that the lidar cloud is just a full rectangular image (though with some 'inf' values that cause a sort of sparseness), and that its .layer_id is redundant with the row index that people would generally already know prior to accessing a point. I ended up making the default transfer of point-clouds to numpy arrays transfer just the xyz information since it's hard to do vectorized ops on heterogeneous structures, and the other info (like a redundant specification of layer-index) usually wouldn't be all that useful anyway. If you weren't bound by backwards compatibility, I would suggest either removing layer_id from Lidar points since it isn't really needed, or at least shifting it after all the floats, to make it easier for people to treat this as a (strided) array of floats.

Webots docs sometimes say that a returned pointer points to memory managed by the simulation and should not be released by the controller. They do not explicitly say this for the pointer to an array of RecognitionObjects returned by wb_camera_recognition_get_objects, though it seems likely that this is a similar case. What sort of memory-management should I be trying to do with these? Should I assume that this array will also automatically be disposed of (or overwritten) by Webots after this timestep (or is it until step() is called, or until after the recognition sampling period expires?), so I needn't worry about disposing of it myself, and should just copy out whatever info I'll want this timestep before it disappears or gets overwritten? Or since the docs don't say this, does that mean that the recognition objects will linger until I do something to say I'm done with them, and if so what should I say and when should I say it?

Also related to such pointers, one design decision I'm running into is when to make a copy of pointed-to information? For high-bandwidth devices like Cameras and Rangefinders, I had not been intentionally copying information at all (aside from offering a .copy() method to do so), and just retrieving info on demand straight from the shared memory. In the case of Rangefinders rf[0,0] would return the distance at those coordinates in the rangefinder image, converted to a python float. In the case of Cameras, cam[0,0] would return a ColorBGRA object that is a ctypes array with useful helper methods like being able to access components as .red etc. However, when ctypes creates this ColorBGRA object, it does not automatically copy or convert any info, and instead just returns a tiny sub-array that shares memory with 4 bytes of the shared-memory image. This could sometimes lead to unexpected results, e.g. if someone stores a color they observed in the image, and Webots updates the image stored there, then the stored ColorBGRA object's color components would change accordingly. My inclination is to say that it would be better, in this case, to return a stable copy of the pixel's color info, rather than something that still shares memory.

Anyway it could help to know:

  1. How long will pointed-to information remain valid (including images, arrays of RecognitionObjects, lists of recogntion colors within RecObjects)? Does it expire at end of timestep? When step() is called with a positive value? When the sampling period for the device expires?
  2. When the pointer is no longer assured to be valid, is it predictable what other information will have taken its place? E.g., if what will have taken its place is just a new image for the same camera, it could be construed as a feature rather than a bug that camera[0,0] gives you a little window upon that pixel which always remains up to date.
  3. How much usage of shared memory are we willing to live with? I think I'm generally fine with documenting that certain big items that are clearly tied to a device (like a Recognition Object or a numpy array version of a camera image) share memory and will be valid only for the timestep, and count on users not to try to retain them for longer. But I think it's generally bad to share memory on smaller more generic things, like colors or RecObject positions, that users might easily end up storing without realizing that it shares memory with something else. So that would mean I'll need to add some complexity to my current implementation for camera images, RecObjects and perhaps PointClouds to have them make copies before returning such small items.

Update: I've now gotten it set up to work as I described in 3. I think the only shared-memory objects that ordinary users should now receive are Camera.Recognition.Objects (no efficient way to avoid this!), and numpy array versions of images, which intentionally share memory for speed. Otherwise I've tried to wrap access to shared memory in container-like interfaces that will keep themselves up-to-date rather than expiring, and that will wait to copy values out of shared memory upon demand. (There are also straightforward ways to access the raw shared-memory versions of these containers, which would be a bit faster, but the docstrings discourage novices from using those.) Hopefully that sounds fine?

@Justin-Fisher
Copy link
Contributor

In other news, I'm happy to report that I think my version of the New Python API is now quite close to being feature-complete. I.e., there aren't any more big things that I know I need to add to it. There may however be many bugs, especially because I haven't yet exposed all the devices to thorough testing.

The main items that remain on my to-do list for this are:

  • debugging, ideally with the help of you or other beta-testers
  • setting up some mechanisms for more backwards compatibility (e.g. allowing from controller import Robot; robot = Robot() as an old-fashioned alternative to import robot, and incorporating this into Webots own file structure rather than just importing it from a project directory as the current version does)
  • copying documentation out of docstrings into python tabs of webots documentation

It should be quite feasible to have this all done by end of month if you wanted? Or it might make sense to gather input from more beta testers first. Up to you!

@omichel
Copy link
Member Author

omichel commented Jun 7, 2022

It's great to finally see the first version of this new python API! Congratulations!
I just opened a new PR to fix the documentation about the life cycle of the camera recognition objects array, see #4603. I hope this answers your question about it.
I am currently very busy chasing bugs in the release we are preparing, but I will for sure review your work as soon as possible.
Meanwhile, you can go ahead with the remaining list of things to do.
I can also post some announcement on our usual communications channels (e.g., Discord, Twitter and LinkedIn) to call for beta testers. Do you think it is a good time for this now?

@Justin-Fisher
Copy link
Contributor

Sure, it'd be good to get feedback from whatever testers you can find. It's currently far from perfect though, but it's getting there, and hopefully they'll understand that there may still be lots of bugs.

@omichel
Copy link
Member Author

omichel commented Jun 9, 2022

All right, then I will post some announcements about it.

@Justin-Fisher
Copy link
Contributor

Justin-Fisher commented Jun 10, 2022

I noticed that I left out Node.contact_points. The documentation doesn't include anything like wb_node_get_number_of_contact_points but I imagine it probably should, since that seems to be how Webots generally provides lists of elements? EDIT: Or maybe this one really is supposed to work differently, setting the number of returned points to the destination of the given *size pointer? (Tis unfortunate that the *size part of the documentation gets hidden in the part you have to pull the scrollbar to see, making it easy to be misled as I initially was, but I guess we're stuck with that...)

Does Webots keep track of two separate contact points tracking statuses: one for including descendants, one for not? So you could be tracking one of these, or the other, or both? Or can there be at most one sort of tracking happening at a time, and either it includes descendants or it doesn't? (I implemented the new Python API in a way that effectively presumes the latter "just one sort of tracking at a time" reading, but perhaps that was mistaken?)

@omichel
Copy link
Member Author

omichel commented Jun 10, 2022

Tis unfortunate that the *size part of the documentation gets hidden in the part you have to pull the scrollbar to see, making it easy to be misled as I initially was, but I guess we're stuck with that...

Yes, unfortunately.

Does Webots keep track of two separate contact points tracking statuses: one for including descendants, one for not? So you could be tracking one of these, or the other, or both? Or can there be at most one sort of tracking happening at a time, and either it includes descendants or it doesn't? (I implemented the new Python API in a way that effectively presumes the latter "just one sort of tracking at a time" reading, but perhaps that was mistaken?)

Yes, you are correct, only one sort of tracking can be active at a time.

@omichel omichel modified the milestones: R2022b, R2022b-rev1 Jul 8, 2022
@omichel
Copy link
Member Author

omichel commented Sep 15, 2022

Now that we are done with the release of Webots R2022b, I wanted to inquire on the status of this PR. Did you receive any feedback from users? Did you progress with the documentation? I would like to move forward integrating this new API in Webots R2023a. I guess the first step would be to create a PR here that would basically import your code from https://github.com/Justin-Fisher/new_python_api_for_webots and add the documentation in the Webots user guide and reference manual. Shall we go ahead this way?

@omichel
Copy link
Member Author

omichel commented Oct 5, 2022

I would like first to try to re-implement the current Python API without swing to improve the user experience for Python users (our library won't be tied to any specific version of Python) and simplify our Python workflow. So, I will close this PR and open a new one on the same branch to go ahead with this implementation. This should facilitate the implementation of the new API, which could be implemented in a second step, once this branch is merged in the develop branch (or even earlier, on a new branch targeting this branch).

@omichel omichel closed this Oct 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Implementation of a major feature
Development

Successfully merging this pull request may close these issues.

3 participants