
multiprocessing.Manager() leaves child process alive after deleting Model object. #149

Open
twj8CDC opened this issue Mar 7, 2024 · 5 comments

Comments

@twj8CDC

twj8CDC commented Mar 7, 2024

Hello,

I am not quite sure if this can be considered a bug, but I thought I would share. Feel free to close if this is too much of an edge issue.

I am running a simulation using the BART model that involves creating and deleting the pymc model (with a BART component) in each iteration. I noticed that as I went through iterations I would accumulate Python processes that were no longer using CPU but appeared to hold memory (~50-100 MB each).

When many iterations were done I started having OOM issues due to these processes gradually taking up memory. These processes would die once the main process dies.

These processes were not the multi-chain/multi-thread processes used in the training/inference (those associated processes were spun-up/down correctly).

I believe the issue comes from the multiprocessing.Manager() used to create the 'all_trees' list.
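
A minimal sketch of that behavior in isolation, with no pymc or BART involved (`manager_child_counts` is a hypothetical helper name, not something from the library): `mp.Manager()` starts a server process that shows up in `mp.active_children()` and goes away once `manager.shutdown()` is called.

```python
import multiprocessing as mp


def manager_child_counts():
    """Return the number of live child processes before, during, and after a Manager's life."""
    before = len(mp.active_children())
    manager = mp.Manager()           # starts a separate server process
    shared = manager.list()          # proxy object; the list itself lives in the server process
    shared.append("tree")
    during = len(mp.active_children())
    manager.shutdown()               # stops the server process
    after = len(mp.active_children())
    return before, during, after


if __name__ == "__main__":
    print(manager_child_counts())
```

If the Manager stays referenced and `shutdown()` is never called, the server process stays alive, which matches the lingering processes described above.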

To resolve the issue I used the following codeblock after each iteration was complete.

import multiprocessing as mp

# Kill every child process of the current process,
# including the server process started by the Manager.
for child in mp.active_children():
    child.kill()

This resolves the issue of lingering processes.
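
One caveat is that the loop above kills every child of the main process, not only the Manager's. A slightly more conservative sketch is to snapshot the child PIDs before the iteration and stop only the ones that appeared since. The names `stop_children_started_since` and `run_iteration` are hypothetical, and `run_iteration` is only a stand-in for building and fitting a model:

```python
import multiprocessing as mp


def stop_children_started_since(pids_before, timeout=5.0):
    """Stop child processes whose PIDs are not in `pids_before`; return their PIDs."""
    stopped = []
    for child in mp.active_children():
        if child.pid in pids_before:
            continue                     # leave pre-existing children alone
        child.terminate()                # polite request first
        child.join(timeout=timeout)      # wait for the process to exit
        if child.is_alive():
            child.kill()                 # fall back to a hard kill
            child.join()
        stopped.append(child.pid)
    return stopped


def run_iteration():
    """Hypothetical stand-in for one iteration that leaves a Manager referenced."""
    manager = mp.Manager()
    trees = manager.list()
    trees.append("tree")
    return manager, trees


if __name__ == "__main__":
    pids_before = {p.pid for p in mp.active_children()}
    leftovers = run_iteration()
    print(stop_children_started_since(pids_before))
    print(mp.active_children())
```

The same warning applies as for the original loop: any proxy that still points at a stopped Manager can no longer be used.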

I am not sure if this should be considered a bug or not, since it only becomes an issue when a high number of BART models are created in a single Python script. I also don't know if there is really a good general solution, because if you kill the child process created by the Manager too early, I would expect there to be issues with further use of the model.

That being said, I could see other users running into this issue in a highly iterative workflow, and generally I would say that a process that doesn't die when the model is deleted is unexpected behavior. So I just wanted to share my experience for future users' reference.

Feel free to close or remove this submission if it is unhelpful.

Thanks!

@aloctavodia
Member

Hi, thanks for sharing. I think this is a bug, even if it only affects a portion of users, and it is also related to the issues people have been observing on Mac. I'm not sure of a good general solution either.

@twj8CDC
Author

twj8CDC commented Mar 8, 2024

One potential solution could be to capture the PID of the manager when it is created (in the BART class), then add a destructor (`__del__`) that kills that process when the instance is deleted.

A simple example of this:

import multiprocessing as mp
import psutil as ps


class c1:
    """Owns a Manager and kills its server process when the instance is deleted."""

    def __init__(self):
        self.a = 1
        manager = mp.Manager()
        # Keep a handle on the manager's server process via its PID.
        # Note: `_process` is a private attribute of the manager.
        self.process = ps.Process(manager._process.ident)
        self.lst = manager.list()

    def __del__(self):
        print("DELETING PROCESS")
        self.process.kill()

    def get_process_id(self):
        # Return the psutil.Process handle so the caller can print it.
        return self.process


class c2:
    """Wraps a c1 instance to show the cleanup also works through composition."""

    def __init__(self):
        self.c11 = c1()
        print("CREATED A NEW MANAGER")
        print(self.c11.get_process_id())


# The guard is needed where child processes are started with "spawn" (e.g. macOS, Windows).
if __name__ == "__main__":
    # Create an instance of the class with a Manager
    c11 = c1()

    # Print the manager's process
    print("This is the manager pid")
    print(c11.get_process_id())
    # Print the active children (the process id should match)
    print("Above should be in this list")
    print(mp.active_children())
    print("Deleting the object will kill the manager process")
    del c11
    print("The list shouldn't contain the process")
    print(mp.active_children())
    # Also works when the class is contained in another class
    c22 = c2()
    print(mp.active_children())
    del c22
    print(mp.active_children())

As far as I can tell, the BART class instance persists for as long as the higher-level model instance is in use, so I wouldn't expect this process to be killed before the model instance is deleted. Based on this simple example, I believe that deleting the model instance will cause the BART instance to be deleted and the process to be killed properly. But I am not super familiar with all of the PyMC internals, so this approach could also cause some unexpected issues.
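
A possible variant of the same idea that avoids psutil and the private `_process` attribute is to call the manager's own `shutdown()` from a `weakref.finalize` callback and to expose an explicit `close()`. This is only a sketch under assumptions: `ManagedTrees` is a hypothetical stand-in class, not the actual BART class, and nothing in this thread establishes that it fits the pymc-bart internals.

```python
import multiprocessing as mp
import weakref


class ManagedTrees:
    """Hypothetical stand-in for a class that owns a Manager-backed list."""

    def __init__(self):
        self._manager = mp.Manager()
        self.trees = self._manager.list()
        # The callback is the manager's shutdown, which does not reference `self`,
        # so registering it does not keep this object alive.
        self._finalizer = weakref.finalize(self, self._manager.shutdown)

    def close(self):
        """Shut the manager down explicitly; calling it again is a no-op."""
        self._finalizer()


if __name__ == "__main__":
    model = ManagedTrees()
    model.trees.append("tree")
    print(len(mp.active_children()))   # manager server process is running
    del model                          # finalizer shuts the manager down
    print(len(mp.active_children()))
```

Compared with `__del__`, a `weakref.finalize` callback runs at most once and can also be triggered deterministically through `close()`, which may be useful when the object is kept alive longer than expected.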

@twj8CDC twj8CDC closed this as completed Mar 8, 2024
@twj8CDC twj8CDC reopened this Mar 8, 2024
@aloctavodia
Member

Would you like to give it a try and send a PR?

@twj8CDC
Author

twj8CDC commented Mar 13, 2024

Yeah sure. Might be a few weeks before I can get to it, but I will give it a try.

@aloctavodia
Member

Thank you! Take your time.
