
using improv with fastplotlib for visualization #85

Open
kushalkolar opened this issue Mar 16, 2023 · 5 comments

@kushalkolar
Contributor

kushalkolar commented Mar 16, 2023

I've been playing around with this for the past week and I have some questions and a possible issue.

  1. This line prevents nexus.startNexus() from being run within a Jupyter notebook, since Jupyter already runs its own asyncio event loop:

res = loop.run_until_complete(self.pollQueues()) #TODO: in Link executor, complete all tasks

This is what I get when I try to call nexus.startNexus() within Jupyter. Various suggested workarounds, such as %autoawait off, make the startNexus() call blocking, which is not what we want.

Note that the line numbers in the traceback don't match your dev branch because I added some comments, line 184 in the traceback is 181 in your dev branch.

improv.nexus Starting processes
improv.nexus <ForkProcess name='Generator' parent=2441387 initial daemon>
improv.nexus <ForkProcess name='Processor' parent=2441387 initial daemon>
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[3], line 1
----> 1 n.startNexus()

File ~/Insync/kushalkolar@gmail.com/drive/repos/improv/improv/nexus.py:184, in Nexus.startNexus(self)
    180     loop.add_signal_handler(
    181         s, lambda s=s: self.stop_polling(s, loop)) #TODO
    182 try:
    183     # This does not work with jupyter
--> 184     res = loop.run_until_complete(self.pollQueues()) #TODO: in Link executor, complete all tasks
    185 except asyncio.CancelledError:
    186     logging.info("Loop is cancelled")

File /usr/lib/python3.10/asyncio/base_events.py:622, in BaseEventLoop.run_until_complete(self, future)
    611 """Run until the Future is done.
    612 
    613 If the argument is a coroutine, it is wrapped in a Task.
   (...)
    619 Return the Future's result, or raise its exception.
    620 """
    621 self._check_closed()
--> 622 self._check_running()
    624 new_task = not futures.isfuture(future)
    625 future = tasks.ensure_future(future, loop=self)

File /usr/lib/python3.10/asyncio/base_events.py:582, in BaseEventLoop._check_running(self)
    580 def _check_running(self):
    581     if self.is_running():
--> 582         raise RuntimeError('This event loop is already running')
    583     if events._get_running_loop() is not None:
    584         raise RuntimeError(
    585             'Cannot run the event loop while another loop is running')

RuntimeError: This event loop is already running

To get a prototype working I just added a return after the start() call here: kushalkolar@1efaa11#diff-5572e082ebe8c66e061e05fea29c3eaec5036c7af0c9fd8ca273f245e97614e0R173

I then just call nexus.setup() and nexus.run() instead of using the improv CLI (since it interferes with the jupyter asyncio described above).

I'm not sure what the best long-term solution is; perhaps creating an actor process that can then be connected to from jupyterlab might work? I think it's possible to do this but I've never tried.
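For reference, this is the general pattern I had in mind — a rough sketch, not improv's actual API (`poll_queues` below is just a stand-in for `Nexus.pollQueues`): detect whether a loop is already running and, if so, schedule the coroutine as a task instead of calling `run_until_complete`:

```python
import asyncio

# Rough sketch, not improv's actual API: poll_queues stands in for
# Nexus.pollQueues. The idea is to detect an already-running event loop
# (as inside Jupyter) and schedule the coroutine as a background task,
# instead of calling loop.run_until_complete, which raises
# "This event loop is already running".
async def poll_queues():
    for _ in range(3):          # placeholder work
        await asyncio.sleep(0)

def start_polling():
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        # no loop running (plain script / improv CLI): blocking is fine
        return asyncio.run(poll_queues())
    # loop already running (Jupyter): return a Task, don't block
    return loop.create_task(poll_queues())

result = start_polling()        # None here; a Task object inside Jupyter
```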

  2. I made a visualization demo based on minimal and it seems to kind of work: https://github.com/kushalkolar/improv/tree/dev/demos/fastplotlib
(video: improv-fpl-2023-03-16_08.09.10.mp4)

Questions:

1-3 are my main questions to get me going right now, 4 & 5 are useful to think about for the future.

  1. Are there any downsides to returning early from nexus.startNexus(), other than losing the CLI: kushalkolar@1efaa11#diff-5572e082ebe8c66e061e05fea29c3eaec5036c7af0c9fd8ca273f245e97614e0R173
  2. How do I remove an item from a q_in or q_out to free up RAM?
  3. I'm using this in order to get the data in the queue from an actor — is this the right way?
    n.actors["Generator"].q_out.get()
  4. In the current example the Generator actor just produces random data, which is received and plotted in the current kernel (not from within an actor). I think the overhead for the remote frame buffer widget is quite low, so I would like to use it like this for now.
    • In the future I could potentially create an Actor to manage the remote frame buffer (or multiple remote frame buffers) for visualizations, but in my experience it is fast enough that I don't think this is necessary. I'd have to dig into the WGPU canvases quite a bit to figure out how to do that.
    • An alternative is to start an IPython kernel which can be connected to from a jupyter notebook and is able to access the queues and get items.
  5. The best way I see this being used for visualization is to have actors that perform preprocessing (OnACID, behavior, etc.) or analysis, with each of these actors dumping data into a "visualization queue" that we can access from the jupyter kernel.
    • For calcium imaging & behavior we can just show the most recently added frame, keeping the queue size at 1.
    • For other visualizations (neural activity for example) we can just use a FIFO queue to append to an existing visualization.
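To illustrate the size-1 idea above — just a sketch with stdlib `queue.Queue`, not whatever queue type improv's Links actually wrap, and the swap below isn't race-free across processes:

```python
from queue import Queue, Empty, Full

# Sketch of the "latest frame only" policy: a size-1 queue where the
# producer replaces any stale frame instead of blocking, so the
# consumer always sees the newest data. Not race-free across
# processes; this just illustrates the idea.
def put_latest(q: Queue, frame) -> None:
    try:
        q.put_nowait(frame)
    except Full:
        try:
            q.get_nowait()      # discard the stale frame
        except Empty:
            pass
        q.put_nowait(frame)

vis_q = Queue(maxsize=1)
put_latest(vis_q, "frame-1")
put_latest(vis_q, "frame-2")    # replaces frame-1
```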

By the way, it'd be useful if the codebase used type annotations; it makes it easier for newcomers to navigate 😄 (hope I'm not stepping on any toes here, I still can't figure out what type of object q_in and q_out are)

@jmxpearson
Contributor

jmxpearson commented Mar 16, 2023

First of all, thanks for this! Really cool stuff!

I can answer some of these, and @draelos might be able to address the others.

If the error is because Jupyter wants to own the event loop, then the "right" behavior here would be to simply schedule these tasks in Jupyter's event loop. However, we need to reinsert tasks after they get finished, since we need to keep polling. I think improv needs to own this.

That is, in answer to your first question: no, this breaks everything. It means the server can't handle any incoming messages from any actor. We use pollQueues to handle inputs and respond.

But, I would like to suggest an alternative approach. Instead of improv run, which launches the CLI, you could simply launch improv server, which will boot up the server listening on ports. That is, just run

! improv server <your yaml file>

and it will spit out the ports it's on. That will take place in a totally separate subprocess, and you can connect to it via zmq and send messages rather than call methods directly (just like Jupyter itself does). I think it would be preferable to having users accessing server internals.
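Roughly like this from the notebook side — the port and message format below are placeholders, not improv's actual protocol; use the ports the server prints and whatever message schema the server actually expects:

```python
import zmq

# Placeholder sketch of talking to a running `improv server` over zmq.
# The REQ/REP pattern, port, and message bytes here are assumptions,
# not improv's documented protocol.
def send_command(port: int, message: bytes, timeout_ms: int = 5000) -> bytes:
    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.REQ)
    sock.connect(f"tcp://127.0.0.1:{port}")
    sock.setsockopt(zmq.RCVTIMEO, timeout_ms)  # don't hang forever on recv
    try:
        sock.send(message)
        return sock.recv()
    finally:
        sock.close()

# e.g. send_command(5555, b"setup") once the server reports its ports
```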

I realize that the current model is multiprocessing queues and that what I'm suggesting would not be shared memory. (But are we sure message-passing is too slow for this case?) I suspect it is possible to have Nexus schedule tasks in Jupyter's running loop, but a) I'm not immediately sure how to do it and b) it sort of inverts the design of the package (where improv, not Jupyter, is the controller), and that may be problematic in ways I don't anticipate.

It seems perfectly reasonable to me to have an actor that's a writeable visualization buffer, though. I'd have to think hard (and can brainstorm with you) how this might work.

By the way, it'd be useful if the codebase used type annotations; it makes it easier for newcomers to navigate 😄 (hope I'm not stepping on any toes here, I still can't figure out what type of object q_in and q_out are)

Agreed. But someone has to do it, and my current priorities (on a very limited time budget) are needed features, tests, and docs, in that order.

@jmxpearson jmxpearson self-assigned this Mar 16, 2023
@kushalkolar
Contributor Author

Thanks for all the details!

I'll try out the server approach. Is there a zmq example which uses message passing for the links instead of queues?

I think that the most elegant way to do this would be to have actors that can be accessed from jupyter. It must be possible to launch a jupyter lab kernel, and then tell the actor to use that kernel. This would be nice because then you can have multiple notebooks corresponding to actors to visualize different parts of the running "improv graph", instead of just converging a bunch of links to a single visualization actor.

Where does improv handle the creation of actors? If there's a way to have actors connect with jupyter kernels I'll try playing around with that.

@draelos
Collaborator

draelos commented Mar 16, 2023

I can chime in with more details later, but wanted to point out that we had a student mock up how to grab data from the central store in order to plot things in jupyter notebooks: https://github.com/project-improv/improv/tree/jupyter_vis/demos/jupyter

Actors are created here: https://github.com/project-improv/improv/blob/main/improv/nexus.py#L185

@kushalkolar
Contributor Author

Thanks for the jupyter example, sorry I completely missed that! So it seems like you used it with zmq and live plotting!

Even if message passing is relatively slow compared to queues, it doesn't matter, since we only need visualizations at 30-60 Hz, not hundreds. So I'll give this a try first before the other more complex ideas!

@jmxpearson
Copy link
Contributor

jmxpearson commented Mar 16, 2023 via email

3 participants