Performance of async_hooks scope manager #695

Closed
jfirebaugh opened this issue Sep 25, 2019 · 8 comments
Labels
async_hooks-perf Performance issues related to async_hooks. community question Further information is requested

Comments

@jfirebaugh

Describe the bug
In our application we noticed a significant performance overhead from using the default async_hooks-based scope manager. After adding scope: 'noop' to our config, we saw the following improvements.

CPU usage halved:

[Image: CPU Usage-2]

Significantly smaller GC pauses, with less variance:

[Image: Garbage Collection Pause Time]

After researching this further, we found that use of async_hooks is known to cause significant performance degradation:

Are there any official recommendations for configuring the node environment and/or dd-trace-js to reduce the overhead of DataDog's async_hooks usage? Is using scope: 'noop' a viable approach? What effects would this have on plugins?

Environment

  • Operating system: Debian
  • Node version: 10.16.3
  • Tracer version: 0.15.2
  • Agent version: 6.13.0
@jfirebaugh jfirebaugh added the bug Something isn't working label Sep 25, 2019
@rochdev rochdev added community question Further information is requested and removed bug Something isn't working labels Sep 25, 2019
@rochdev
Member

rochdev commented Sep 26, 2019

Are there any official recommendations for configuring the node environment and/or dd-trace-js to reduce the overhead of DataDog's async_hooks usage?

Unfortunately, async_hooks is the only context propagation construct provided by Node. Some alternatives exist as libraries, but they are no longer maintained, have many known issues, and are discouraged by the Node team. There are a few ways to improve the performance, but in general it means reducing the use of promises either directly in the code, using a transpiler, or switching to a promise library instead of native promises. The performance of async_hooks should improve in future versions of Node, but it's not

Is using scope: 'noop' a viable approach? What effects would this have on plugins?

Plugins rely on having the active span available on the current scope. This is ensured by async_hooks which is used by the scope manager. Using the no-op scope manager would cause spans to end up in disconnected traces, and could cause some plugins to stop working properly.
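
To illustrate why plugins depend on the scope manager, here is a minimal sketch of context propagation built on async_hooks. It is not dd-trace's implementation: the names spans, activate, and active are hypothetical, and a real scope manager handles many more cases. It only shows the idea that each new async resource inherits the span that was active where it was created.

```javascript
'use strict';
const asyncHooks = require('async_hooks');

// Hypothetical store: async resource id -> span active for that resource.
const spans = new Map();

const hook = asyncHooks.createHook({
  // Called synchronously whenever a timer, promise, socket, etc. is created.
  init(asyncId) {
    const parent = spans.get(asyncHooks.executionAsyncId());
    if (parent !== undefined) spans.set(asyncId, parent);
  },
  // Called once the resource is gone, so the map does not grow forever.
  destroy(asyncId) {
    spans.delete(asyncId);
  }
});
hook.enable();

// Run `fn` with `span` active, then restore whatever was active before.
function activate(span, fn) {
  const id = asyncHooks.executionAsyncId();
  const previous = spans.get(id);
  spans.set(id, span);
  try {
    return fn();
  } finally {
    if (previous === undefined) spans.delete(id);
    else spans.set(id, previous);
  }
}

// Return the span active in the current execution context, or null.
function active() {
  const span = spans.get(asyncHooks.executionAsyncId());
  return span === undefined ? null : span;
}
```

With a no-op scope manager, active() would always return null, so a plugin that creates a child span inside a callback has no parent to attach it to, which is how spans end up in disconnected traces.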

We used to support an alternative scope manager based on a fork of async-listener which may be more performant. If you want to try it, you would have to temporarily downgrade the tracer to 0.13.1 and configure it using tracer.init({ scope: 'async-listener' }). If you try this approach and it improves the performance without breaking the traces, definitely let me know and we could bring back this scope manager. It's worth noting however that any newer constructs like async/await are not supported by the monkey-patching approach used in async-listener, so even if it works for an isolated test, it may cause issues down the line if you plan to use these new constructs.

@rochdev
Member

rochdev commented Sep 26, 2019

One other thing that might be worth testing is if the issue is caused only by async_hooks itself or by the spans that are stored in the scope by dd-trace. You could check that by running your service with tracer.init({ plugins: false }). This will result in no trace being generated at all, so you would be able to isolate only the scope manager, and see the difference between no-op and async_hooks.

@jfirebaugh
Author

Thank you for the response @rochdev.

We did test with plugins: false and did not see a noticeable performance difference compared to the baseline (prior to using scope: 'noop'). So the poor performance is almost certainly caused by the use of async_hooks itself.

We make extensive use of promises and async/await, and this is a performance sensitive service, so for now we will use dd-trace-js only for the native metrics and disable tracing ({ scope: 'noop', plugins: false, runtimeMetrics: true }).

@rochdev
Member

rochdev commented Sep 27, 2019

Plugins that are usually at the root of the trace can safely be enabled. This usually includes any kind of server such as web frameworks like Express. You can enable these using for example tracer.use('express') since they do not rely on the scope manager to work. This will give you at least some high-level visibility until the performance is addressed with async_hooks.
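
Putting the options mentioned in this thread together, a setup along these lines might work. This is a sketch, assuming the option names quoted above (scope, plugins, runtimeMetrics) behave as described for the 0.15.x tracer; the helper startTracer and the rootPlugins parameter are hypothetical, and nothing here has been checked against newer dd-trace versions.

```javascript
'use strict';

// Options quoted earlier in this thread: no scope tracking, no automatic
// plugins, runtime metrics still reported.
const metricsOnlyOptions = {
  scope: 'noop',
  plugins: false,
  runtimeMetrics: true
};

// Hypothetical helper: initialize the tracer, then re-enable only plugins
// that sit at the root of a trace (for example a web framework).
function startTracer(options, rootPlugins = ['express']) {
  // Required lazily so this file loads even where dd-trace is not installed.
  const tracer = require('dd-trace').init(options);
  for (const name of rootPlugins) {
    tracer.use(name);
  }
  return tracer;
}
```

Calling startTracer(metricsOnlyOptions) as the first statement of the application's entry point, before the web framework is required, would be the intended usage.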

Sorry that we can't do more for now, we really depend on Node to fix the performance issues in async_hooks for this one. Fortunately this is supposed to happen soon. I'll keep this thread updated as any progress is made.

@rochdev
Member

rochdev commented Oct 4, 2019

There was a performance optimization in Node 12 to reduce the memory impact (and thus GC) of async_hooks.

@jfirebaugh Can you try with Node >=12 and let me know if you see better results? This would help a lot as I'm trying to see if there is anything we can do to fix or improve this further directly in Node.
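
For anyone comparing Node versions, GC pause time can be collected from inside the process. The following is a rough sketch using the built-in perf_hooks module; the helper names summarizePauses and watchGc are hypothetical, and the durations it reports are what Node exposes as 'gc' performance entries, which may not match the figures in an external monitoring dashboard.

```javascript
'use strict';
const { PerformanceObserver } = require('perf_hooks');

// Summarize a list of entries that each carry a `duration` in milliseconds.
function summarizePauses(entries) {
  let totalMs = 0;
  let maxMs = 0;
  for (const { duration } of entries) {
    totalMs += duration;
    if (duration > maxMs) maxMs = duration;
  }
  return { count: entries.length, totalMs, maxMs };
}

// Start recording GC entries; the returned function stops and summarizes.
function watchGc() {
  const seen = [];
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      seen.push({ duration: entry.duration });
    }
  });
  observer.observe({ entryTypes: ['gc'] });

  return function stop() {
    observer.disconnect();
    return summarizePauses(seen);
  };
}
```

Calling watchGc() before a load test and the returned stop function afterwards, once with the default scope manager and once with scope: 'noop', gives two summaries that can be compared on Node 10 and Node 12.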

@adityabansod

@rochdev thanks for your detailed write-up in this issue. just as a note, we are seeing this same issue with using scopes in a lot of promise-y and async code. we deployed Node 12 and then enabled our DD implementation, and there was no performance improvement in Node 12 over Node 10.

you can see in the image below, from the mongodb service's perspective, our ability to push load dropped significantly when we had the dd-scope tracing turned on. this test was done with Node 12 deployed.

[Image: Screen Shot 2020-01-03 at 12 40 31 PM]

cc @terranblake

@rochdev
Member

rochdev commented Jan 6, 2020

@adityabansod Were you able to confirm that the issue is really async_hooks alone and not the tracer in your tests?

@bengl
Collaborator

bengl commented May 19, 2021

Closing this one due to inactivity. Feel free to reopen or comment if that makes sense, noting that we also have #1095 open for async_hooks performance.

@bengl bengl closed this as completed May 19, 2021