Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poller.remove socket disposed fix #835

Open
wants to merge 33 commits into
base: master
Choose a base branch
from

Conversation

jasells
Copy link

@jasells jasells commented Dec 8, 2019

This should fix #834 and a few other similar potential bugs in other versions of Remove(), and RemoveAndDispose()

See a few other comments on a couple of the tests and about throwing ArgumentException from Remove() when socket is disposed. Those should be left to a future PR, though.

@somdoron
Copy link
Member

somdoron commented Dec 8, 2019

I think this is a breaking change, as you are changing a return type.
@drewnoakes what do you think?

@jasells
Copy link
Author

jasells commented Dec 8, 2019

I think this is a breaking change, as you are changing a return type.
@drewnoakes what do you think?

I didn't realize Run was public. BUT, while technically breaking change from the standpoint of the public API changed, any code written to a void return type will be unaffected, since that code isn't expecting anything, there won't be a conflict in the return type.

This is a much better option than leaving a thread race bug in, IMO.

It is also much better practice to return Task from async methods to avoid this very scenario in the first place, and the caller (Remove() in this case) can decide whether they need to await completion of async task or not.

Copy link
Member

@drewnoakes drewnoakes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for working to improve this. Asynchrony is hard to get right, especially when retrofitting it to a synchronous API such as NetMQ's.

I am very concerned about the blocking waits here. These will definitely lead to deadlocks in user code. I feel like a better approach would be to introduce new asynchronous overloads, and potentially obsolete the current synchronous ones.

I think you might find this article interesting as it covers several things I've highlighted in review comments.

any code written to a void return type will be unaffected

It's worth distinguishing between source compatibility (which you're referring to) and binary compatibility (which a lot of people expect). An example of where this code change could break someone's application at run time is in a dependency graph where one package references NetMQ version N, and another references NetMQ version N+1. Only one version can exist at run time, meaning one of those two packages would experience MissingMethodException. The end user would be confused and unable to fix the problem without patching the library packages they're using, which is a hard place to be. When writing public library code like NetMQ you must think in terms of binary compatibility, not source compatibility.

It is also much better practice to return Task from async methods

Agree 100%. For backwards compatibility and correctness, I think we have to leave the current methods as-is and introduce better behaved overloads that work as expected. Also, the .NET convention is that asynchronous methods end in Async, so changing the existing non-Async methods to return Task is not idiomatic and may confuse people. NetMQPoller.RunAsync violates this (which I think is my fault, sorry).

src/NetMQ.Tests/NetMQPollerTest.cs Outdated Show resolved Hide resolved
src/NetMQ.Tests/NetMQPollerTest.cs Outdated Show resolved Hide resolved
src/NetMQ/NetMQPoller.cs Outdated Show resolved Hide resolved
src/NetMQ/NetMQPoller.cs Outdated Show resolved Hide resolved
action();

t = FromResult<object>(null);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than instantiate a new completed task for each invocation, cache an instance in a CompletedTask field and reuse it. You can do that in an internal utility class if you like, maybe called Tasks.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This same pattern is also in existing overloaded method ContainsAsync. Also, I'm not convinced it is a good idea to share TaskCompletionSource instances among (potentially) multiple threads...?

Copy link
Member

@drewnoakes drewnoakes Mar 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A completed task is safe to share. Use Task.CompletedTask for TFMs that support it, and cache your own (from a TaskCompletionSource<>) for remaining TFMs.

The benefit of doing so is that this code path will not perform any heap allocations.

src/NetMQ/NetMQPoller.cs Outdated Show resolved Hide resolved
src/NetMQ.Tests/NetMQPollerTest.cs Show resolved Hide resolved
src/NetMQ.Tests/NetMQPollerTest.cs Show resolved Hide resolved
src/NetMQ.Tests/NetMQPollerTest.cs Show resolved Hide resolved
src/NetMQ.Tests/NetMQPollerTest.cs Show resolved Hide resolved
@jasells
Copy link
Author

jasells commented Jan 21, 2020

I haven't forgotten about this PR. Been busy with holidays and getting back to work. I will try to address the concerns raised here this week and issue an update.

@jasells
Copy link
Author

jasells commented Feb 18, 2020

OK, so I pushed up several new commits to (hopefully) address the breaking API concerns, while still fixing the sync issues by adding Task RemoveAsyc() and Task RunAsync() methods and leaving the existing API intact (though still broken, if used).

/// <summary>
/// Run an action on the Poller thread
/// </summary>
/// <param name="action">The action to run</param>
[Obsolete("Queues the action on the poller's thread, but provides no sync mechanism. Please use RemoveAsync() to remove sockets or timers")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message needs reviewing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we should force people to use the new async APIs. What do you think @somdoron?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't force anyone to use the new API's, it just forces them to opt-out (with compiler ignores for release build).

It is strongly suggesting to use the new API's, but that is much better than the status-quo, IMHO: can't dispose sockets at all, just have to rely on GC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a sticking point? It can easily be removed, but is not my preference... But need clear direction from NetMq team as to the final yay or nay.

What is the preferred method to communicate this issue and change to application coders?

Copy link
Author

@jasells jasells Sep 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@drewnoakes So, I should interpret "I'm not sure..." as "I'm not approving this until it is removed" ?

Is moving this text to the summary block acceptable, or is this type of information put in the documentation web site only?

/// <summary>
/// Run an action on the Poller thread
/// </summary>
/// <param name="action">The action to run</param>
[Obsolete("Queues the action on the poller's thread, but provides no sync mechanism. Please use RemoveAsync() to remove sockets or timers")]
public void Run([NotNull] Action action)
{
if (!IsRunning || CanExecuteTaskInline)
action();
else
new Task(action).Start(this);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should use Task.Factory.StartNew.

Copy link
Author

@jasells jasells Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is identical to the code in the original Run(Action) method. See here.

Using Task.Factory.StartNew requires several additional parameters in addition to specifying the scheduler:

Task<TResult> StartNew<TResult>(Func<TResult> function, CancellationToken cancellationToken, TaskCreationOptions creationOptions, TaskScheduler scheduler)

Certainly possible, but I'm uncertain what the TaskCreationOptions should be here, nor does the public API supply a CancellationToken.

Adding a cancellationToken parameter would break the existing API, and defining one internally requires creating it even though it won't be used to stop the task, unless an arbitrary timeout is assigned.

**updated to reference the original/master branch

Copy link
Author

@jasells jasells Sep 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@drewnoakes I need further clarification here. This request seems to require creating an API change? The current Task(action).Start(this) ensures that the task is run on the poller thread, so I don't see an advantage to using Task.Factory.StartNew.

Is there a compatibility issue with the current impl? If so, how should we handle the extra parameters? Creating and passing a CancellationToken that is never used seems hacky.

@@ -225,7 +263,7 @@ public void Add([NotNull] NetMQTimer timer)
throw new ArgumentNullException(nameof(timer));
CheckDisposed();

Run(() => m_timers.Add(timer));
RunAsync(() => m_timers.Add(timer));
Copy link
Member

@drewnoakes drewnoakes Mar 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For completeness, shouldn't we have AddAsync methods that return tasks? These sync methods could call into them and discard the Task.

Is there a reason you added RemoveAsync but not AddAsync?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't run into any problems with the current add implementations...

BUT since you ask, it is probably possible to do

poller.Add(socket);
await poller.RemoveAsync(socket);
socket.Dispose();

and still have issues? Maybe? It is possibly a race condition?

@jasells
Copy link
Author

jasells commented Mar 31, 2020

For some reason, I didn't get any notifications about the comments. I'll look into them over the next few days.

@codecov
Copy link

codecov bot commented Mar 31, 2020

Codecov Report

Merging #835 into master will increase coverage by 0.47%.
The diff coverage is 85.41%.

@@            Coverage Diff             @@
##           master     #835      +/-   ##
==========================================
+ Coverage   67.18%   67.65%   +0.47%     
==========================================
  Files         123      124       +1     
  Lines        8000     8030      +30     
  Branches     1311     1314       +3     
==========================================
+ Hits         5375     5433      +58     
+ Misses       2096     2075      -21     
+ Partials      529      522       -7     
Impacted Files Coverage Δ
src/NetMQ/BufferPool.cs 46.66% <ø> (ø)
src/NetMQ/Core/Utils/AtomicCounter.cs 75.00% <ø> (ø)
src/NetMQ/AtomicCounterPool.cs 80.00% <80.00%> (ø)
src/NetMQ/NetMQPoller.cs 72.39% <86.36%> (-0.44%) ⬇️
src/NetMQ/Msg.cs 72.18% <90.00%> (+2.32%) ⬆️
src/NetMQ/NetMQSelector.cs 67.94% <100.00%> (ø)
src/NetMQ/Core/Utils/Poller.cs 75.00% <0.00%> (-3.58%) ⬇️
src/NetMQ/Core/SocketBase.cs 68.40% <0.00%> (+0.21%) ⬆️
... and 7 more

@jasells
Copy link
Author

jasells commented Jun 16, 2020

OK, so does this PR have a chance, or are you guys ignoring it. If the latter, I'll just fork and move on with my life.

@jzi96
Copy link

jzi96 commented Sep 15, 2020

Any update to this branch and issue? Issue is closed but branch not merged?

Thx

@drewnoakes
Copy link
Member

A bunch of the feedback I left was not addressed. Even still, there's a non-trivial amount of work in reviewing this change for correctness and pandemic life isn't giving me much free time.

@jzi96
Copy link

jzi96 commented Sep 16, 2020

Thanks for the feedback Drew. Good to know. :)

Stay save,
Jan

@jasells
Copy link
Author

jasells commented Sep 16, 2020

Is the only remaining issue the AddAsync comment? I replied, but never got a response. In case it is unclear, I see no need for AddAsync, but they could be added, I suppose. I'll have to refresh my memory at this point, and it looks like there are conflicts from master updates.

@drewnoakes
Copy link
Member

@jasells for me at least, all unresolved comments are still issues for me. In each case my original comment stands.

@stale
Copy link

stale bot commented Apr 16, 2022

This issue has been automatically marked as stale because it has not had activity for 365 days. It will be closed if no further activity occurs within 56 days. Thank you for your contributions.

@stale stale bot added the stale label Apr 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Poller will throw ObjectDisposed exception if Socket.Dispose() called
4 participants