-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix class/resource leak in :nessie-quarkus:test
#9441
Comments
So, good news first: there are
But there are
|
Discussion on Quarkus Zulip |
The leak's caused by In JUnit, tests are run from the single If a test somehow "touches" a dependency that creates a thread local, that thread local is attached to the Even though everything else is properly cleaned up before the next test w/ a different Quarkus test class loader starts, the thread locals are still referenced by the Here's an output that (should) illustrate the issue. It's generated by some hacky code that inspects the
|
Just documenting the extremely hacky code used to identify the issue... public class TempBase {
@BeforeAll
public static void foo() throws Exception {
System.out.println();
System.out.println();
System.out.println("*** START UP");
System.out.println();
System.out.println();
dump();
}
@AfterAll
public static void cleanup() throws Exception {
System.out.println();
System.out.println();
System.out.println("*** CLEAN UP");
System.out.println();
System.out.println();
dump();
}
private static void dump() throws Exception {
Map<ClassLoader, List<Thread>> threadsByClassLoader = new IdentityHashMap<>();
Thread t1 = null;
for (Thread thread : Thread.getAllStackTraces().keySet()) {
if (thread.threadId() == 1) {
t1 = thread;
}
threadsByClassLoader
.computeIfAbsent(thread.getContextClassLoader(), cl -> new ArrayList<>())
.add(thread);
}
threadsByClassLoader.forEach(
(cl, t) -> {
if (cl == null) {
return;
}
System.out.println(cl);
t.sort(Comparator.comparing(Thread::getName).thenComparing(Thread::getId));
for (Thread thread : t) {
System.out.println(" " + thread.threadId() + " " + thread.getName());
}
});
if (t1 != null) {
dumpThreadLocals(t1);
}
Thread t = Thread.currentThread();
if (t != t1) {
dumpThreadLocals(t);
}
}
private static void dumpThreadLocals(Thread t)
throws NoSuchFieldException, IllegalAccessException {
Map<ClassLoader, List<Object>> threadLocalsByClassLoader = new IdentityHashMap<>();
Field threadLocalsField = Thread.class.getDeclaredField("threadLocals");
threadLocalsField.setAccessible(true);
Object threadLocals = threadLocalsField.get(t);
Field tableField = threadLocals.getClass().getDeclaredField("table");
tableField.setAccessible(true);
Object[] entries = (Object[]) tableField.get(threadLocals);
for (Object entry : entries) {
if (entry != null) {
Field valueField = entry.getClass().getDeclaredField("value");
valueField.setAccessible(true);
Object value = valueField.get(entry);
if (value != null) {
threadLocalsByClassLoader
.computeIfAbsent(value.getClass().getClassLoader(), cl -> new ArrayList<>())
.add(value);
Field referentField = Reference.class.getDeclaredField("referent");
referentField.setAccessible(true);
Object referent = referentField.get(entry);
if (referent != null
&& referent
.getClass()
.getName()
.equals("java.lang.ThreadLocal$SuppliedThreadLocal")) {
Field supplierField = referent.getClass().getDeclaredField("supplier");
supplierField.setAccessible(true);
Object supplier = supplierField.get(referent);
if (supplier != null) {
threadLocalsByClassLoader
.computeIfAbsent(supplier.getClass().getClassLoader(), cl -> new ArrayList<>())
.add(supplier);
}
}
}
}
}
System.out.println("THREAD LOCALS OF " + t);
threadLocalsByClassLoader.forEach(
(cl, objs) -> {
if (cl == null) {
return;
}
System.out.println(cl);
objs.sort(
Comparator.comparing(o -> o.getClass().getName())
.thenComparing(System::identityHashCode));
for (Object obj : objs) {
System.out.println(" " + obj.getClass().getName() + " " + obj);
}
});
}
} |
See projectnessie#9441 for a complete description of the underlying problem. TL;DR is: `ThreadLocal`s from various 3rd party libraries leak into the single `Test worker` thread that runs all the tests, resulting in TL objects/suppliers from the various Quarkus test class loaders, eventually leading to nasty OOMs. This change updates the `MultiEnvTestEngine` by using the new `ThreadPerTestClassExecutionExecutorService` and also "assimilate" really all tests, even the non-multi-env tests, so that those also run on a thread per test-class. Since each test class runs on "its own thread", the `ThreadLocal`s are registered on that thread. Once the test class finishes, the thread is disposed and its thread locals become eligible for garbage collection, which is what is needed. Fixes projectnessie#9441
See projectnessie#9441 for a complete description of the underlying problem. TL;DR is: `ThreadLocal`s from various 3rd party libraries leak into the single `Test worker` thread that runs all the tests, resulting in TL objects/suppliers from the various Quarkus test class loaders, eventually leading to nasty OOMs. This change updates the `MultiEnvTestEngine` by using the new `ThreadPerTestClassExecutionExecutorService` and also "assimilate" really all tests, even the non-multi-env tests, so that those also run on a thread per test-class. Since each test class runs on "its own thread", the `ThreadLocal`s are registered on that thread. Once the test class finishes, the thread is disposed and its thread locals become eligible for garbage collection, which is what is needed. The bump of the max-heap size for test workers is also reduced back to 2g (was changed in projectnessie#9433). Fixes projectnessie#9441
See projectnessie#9441 for a complete description of the underlying problem. TL;DR is: `ThreadLocal`s from various 3rd party libraries leak into the single `Test worker` thread that runs all the tests, resulting in TL objects/suppliers from the various Quarkus test class loaders, eventually leading to nasty OOMs. This change updates the `MultiEnvTestEngine` by using the new `ThreadPerTestClassExecutionExecutorService` and also "assimilate" really all tests, even the non-multi-env tests, so that those also run on a thread per test-class. Since each test class runs on "its own thread", the `ThreadLocal`s are registered on that thread. Once the test class finishes, the thread is disposed and its thread locals become eligible for garbage collection, which is what is needed. The bump of the max-heap size for test workers is also reduced back to 2g (was changed in projectnessie#9433). Fixes projectnessie#9441
See projectnessie#9441 for a complete description of the underlying problem. TL;DR is: `ThreadLocal`s from various 3rd party libraries leak into the single `Test worker` thread that runs all the tests, resulting in TL objects/suppliers from the various Quarkus test class loaders, eventually leading to nasty OOMs. This change updates the `MultiEnvTestEngine` by using the new `ThreadPerTestClassExecutionExecutorService` and also "assimilate" really all tests, even the non-multi-env tests, so that those also run on a thread per test-class. The logic to distinguish multi-env from non-multi-env tests via `MultiEnvExtensionRegistry.registerExtension()` via test discovery is not 100% deterministic, it can add multi-env tests to the non-multi-env tests, so an additional check is needed there. Since each test class runs on "its own thread", the `ThreadLocal`s are registered on that thread. Once the test class finishes, the thread is disposed and its thread locals become eligible for garbage collection, which is what is needed. The bump of the max-heap size for test workers is also reduced back to 2g (was changed in projectnessie#9433). Fixes projectnessie#9441
See projectnessie#9441 for a complete description of the underlying problem. TL;DR is: `ThreadLocal`s from various 3rd party libraries leak into the single `Test worker` thread that runs all the tests, resulting in TL objects/suppliers from the various Quarkus test class loaders, eventually leading to nasty OOMs. This change updates the `MultiEnvTestEngine` by using the new `ThreadPerTestClassExecutionExecutorService` and also "assimilate" really all tests, even the non-multi-env tests, so that those also run on a thread per test-class. The logic to distinguish multi-env from non-multi-env tests via `MultiEnvExtensionRegistry.registerExtension()` via test discovery is not perfect (but good enough), it can add multi-env tests to the non-multi-env tests, so an additional check is needed there. Since each test class runs on "its own thread", the `ThreadLocal`s are registered on that thread. Once the test class finishes, the thread is disposed and its thread locals become eligible for garbage collection, which is what is needed. The bump of the max-heap size for test workers is also reduced back to 2g (was changed in projectnessie#9433). Fixes projectnessie#9441
See projectnessie#9441 for a complete description of the underlying problem. TL;DR is: `ThreadLocal`s from various 3rd party libraries leak into the single `Test worker` thread that runs all the tests, resulting in TL objects/suppliers from the various Quarkus test class loaders, eventually leading to nasty OOMs. This change updates the `MultiEnvTestEngine` by using the new `ThreadPerTestClassExecutionExecutorService` and also "assimilate" really all tests, even the non-multi-env tests, so that those also run on a thread per test-class. The logic to distinguish multi-env from non-multi-env tests via `MultiEnvExtensionRegistry.registerExtension()` via test discovery is not perfect (but good enough), it can add multi-env tests to the non-multi-env tests, so an additional check is needed there. Since each test class runs on "its own thread", the `ThreadLocal`s are registered on that thread. Once the test class finishes, the thread is disposed and its thread locals become eligible for garbage collection, which is what is needed. The bump of the max-heap size for test workers is also reduced back to 2g (was changed in projectnessie#9433). Fixes projectnessie#9441
Related JUnit issue |
See projectnessie#9441 for a complete description of the underlying problem. TL;DR is: `ThreadLocal`s from various 3rd party libraries leak into the single `Test worker` thread that runs all the tests, resulting in TL objects/suppliers from the various Quarkus test class loaders, eventually leading to nasty OOMs. This change updates the `MultiEnvTestEngine` by using the new `ThreadPerTestClassExecutionExecutorService` and also "assimilate" really all tests, even the non-multi-env tests, so that those also run on a thread per test-class. The logic to distinguish multi-env from non-multi-env tests via `MultiEnvExtensionRegistry.registerExtension()` via test discovery is not perfect (but good enough), it can add multi-env tests to the non-multi-env tests, so an additional check is needed there. Since each test class runs on "its own thread", the `ThreadLocal`s are registered on that thread. Once the test class finishes, the thread is disposed and its thread locals become eligible for garbage collection, which is what is needed. The bump of the max-heap size for test workers is also reduced back to 2g (was changed in projectnessie#9433). Fixes projectnessie#9441
See #9441 for a complete description of the underlying problem. TL;DR is: `ThreadLocal`s from various 3rd party libraries leak into the single `Test worker` thread that runs all the tests, resulting in TL objects/suppliers from the various Quarkus test class loaders, eventually leading to nasty OOMs. This change updates the `MultiEnvTestEngine` by using the new `ThreadPerTestClassExecutionExecutorService` and also "assimilate" really all tests, even the non-multi-env tests, so that those also run on a thread per test-class. The logic to distinguish multi-env from non-multi-env tests via `MultiEnvExtensionRegistry.registerExtension()` via test discovery is not perfect (but good enough), it can add multi-env tests to the non-multi-env tests, so an additional check is needed there. Since each test class runs on "its own thread", the `ThreadLocal`s are registered on that thread. Once the test class finishes, the thread is disposed and its thread locals become eligible for garbage collection, which is what is needed. The bump of the max-heap size for test workers is also reduced back to 2g (was changed in #9433). Fixes #9441
Added a commit to #9433 to increase the max test-worker heap size to 3g. This makes the
:nessie-quarkus:test
task finish.There is a resource leak, but that's already present with earlier Quarkus versions, so I suspect something that's present in all our tests is causing the leak, #9440 is one cause, but not the only one.
This issue is to investigate, find the root cause, fix it and bring the max test-worker heap size back to 2g.
The text was updated successfully, but these errors were encountered: