Multiple instances of RemoteAgentBuffer will cause seg fault #719

Closed
Gamenot opened this issue Mar 30, 2021 · 3 comments · Fixed by #747
Labels
bug Something isn't working

Comments

@Gamenot
Collaborator

Gamenot commented Mar 30, 2021

Description

Creating more than one RemoteAgentBuffer in the same process causes a segmentation fault. This blocks running more than one instance of SMARTS in the same process.

Context

I have tested that you can have two instances of the Renderer in the same thread, so I think this PR accomplishes the intention. I would still prefer to see a simple test for this.

I also understand why there is a rendering lock... I wonder about the implications of it on performance but it is enough that rendering for multiple simulations is possible on the same process.

In addition, I have found that the RemoteAgentBuffer currently prevents two instances of SMARTS from existing on the same thread, so I am adding that as an issue.

Originally posted by @Gamenot in #706 (comment)

Gamenot added the bug label Mar 30, 2021
Gamenot changed the title from "Multiple instances of RemoteAgentBuffer cause seg fault" to "Multiple instances of RemoteAgentBuffer will cause seg fault" Mar 30, 2021
Gamenot added this to the SMARTS Backlog milestone Mar 31, 2021
@sah-huawei
Contributor

I've confirmed this, and it is related to the version of the grpcio dependency.
With version 1.31.0, the simple test in PR #745 segfaults in the RemoteAgent __init__ function, on this line: grpc.channel_ready_future(self._worker_channel).result(timeout=30)
(gdb shows that the segfault is caused by a Flush in the grpc C code. Probably a threading bug.)

Versions 1.35.0 and 1.37.0 (the latest) don't segfault.
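
For anyone checking their own environment, here is a minimal sketch that compares the installed grpcio version against the versions reported in this comment. The helper names (`classify_grpcio`, `installed_grpcio_status`) are made up for illustration and are not part of SMARTS or grpc. Only 1.31.0, 1.35.0, and 1.37.0 were tested here, so any other version is reported as untested rather than guessed.

```python
from importlib import metadata

# Versions reported in this thread. Anything else was not tested here.
REPORTED_SEGFAULT = {"1.31.0"}
REPORTED_OK = {"1.35.0", "1.37.0"}


def classify_grpcio(version: str) -> str:
    """Map a grpcio version string to what this thread reported for it."""
    if version in REPORTED_SEGFAULT:
        return "segfault reported"
    if version in REPORTED_OK:
        return "no segfault reported"
    return "not tested in this thread"


def installed_grpcio_status() -> str:
    """Describe the grpcio version installed in the current environment."""
    try:
        version = metadata.version("grpcio")
    except metadata.PackageNotFoundError:
        return "grpcio not installed"
    return f"grpcio {version}: {classify_grpcio(version)}"


if __name__ == "__main__":
    print(installed_grpcio_status())
```

The lookup is deliberately an exact match: the thread gives no information about versions between 1.31.0 and 1.35.0, so the sketch does not assume a cutoff.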

@sah-huawei
Contributor

@Adaickalavan

@sah-huawei
Contributor

sah-huawei commented Apr 8, 2021

This is currently the only remaining reason we can't run multiple instances of SMARTS in the same process, and therefore also can't run our tests without --forked (cf. Issue #743).
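
To illustrate why process isolation (which is what --forked provides per test) hides this kind of crash, here is a rough sketch, assuming a POSIX system. It does not use SMARTS or grpc: the child process simply sends itself SIGSEGV to stand in for the real segfault, and `run_isolated` and `crashed_with_segfault` are hypothetical helper names.

```python
import signal
import subprocess
import sys


def run_isolated(source: str) -> int:
    """Run Python source in a separate interpreter and return its exit code."""
    return subprocess.run([sys.executable, "-c", source]).returncode


def crashed_with_segfault(returncode: int) -> bool:
    """On POSIX, a child killed by signal N reports a return code of -N."""
    return returncode == -signal.SIGSEGV


if __name__ == "__main__":
    # Stand-in for the real crash: the child kills itself with SIGSEGV.
    crash = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    print(crashed_with_segfault(run_isolated(crash)))
    # The parent survives and can keep launching further children.
    print(run_isolated("pass"))
```

The parent only sees a negative return code, so a segfault in one child cannot take down the others. That is why the tests pass with --forked while the underlying problem remains.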
