.NET 8.0.10 vs 9.0.0 RC2 GC Server Performance Regression in Sep (CSV Parser) Benchmark (due to DATAS default) #109047

nietras · 2024-10-19T10:09:01Z

In https://github.com/nietras/Sep (a fast, highly optimized CSV parser) I have been comparing performance between .NET 8 and .NET 9 RC2 using comparison-bench.ps1, and have observed what appears to be a consistent and significant performance regression when using ServerGarbageCollection (true). The benchmark in question is also discussed in https://www.joelverhagen.com/blog/2020/12/fastest-net-csv-parsers

The benchmarks can be run by cloning the Sep repo, checking out branch net9.0 and running the command in comparison-bench.ps1, perhaps adding --filter *GcServer*Sep* or similar. Details of the benchmark and machine are given below via BenchmarkDotNet.

As can be seen, this shows a regression in a scenario with many medium-sized object allocations, ranging from 500.3 ms / 429.7 ms ≈ 1.16x (single-threaded) to 174.8 ms / 103.0 ms ≈ 1.70x (multi-threaded).
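For anyone who wants to recheck the arithmetic, here is a minimal Python sketch that recomputes the slowdown from the unrounded 1,000,000-row means reported in this post. The dictionary layout and the `slowdown` helper are illustrative names only and are not part of Sep or BenchmarkDotNet.

```python
# Mean times in milliseconds, copied from the 1,000,000-row results in this post.
MEANS_MS = {
    ("Sep______", ".NET 8.0"): 429.654,
    ("Sep_MT___", ".NET 8.0"): 102.979,
    ("Sep______", ".NET 9.0"): 500.250,
    ("Sep_MT___", ".NET 9.0"): 174.802,
}


def slowdown(method: str) -> float:
    """Return the .NET 9.0 mean divided by the .NET 8.0 mean for one method."""
    return MEANS_MS[(method, ".NET 9.0")] / MEANS_MS[(method, ".NET 8.0")]


if __name__ == "__main__":
    for method in ("Sep______", "Sep_MT___"):
        print(f"{method}: {slowdown(method):.2f}x")
```

With the full means this gives about 1.16x single-threaded and about 1.70x multi-threaded; rounding the inputs to whole milliseconds first shifts the second decimal slightly.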

I know there have been changes to the GC; my question is whether this regression is expected. I also wanted to flag it in case it is of interest.

BenchmarkDotNet v0.14.0, Windows 10 (10.0.19044.3086/21H2/November2021Update)
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK 9.0.100-rc.2.24474.11
  [Host]     : .NET 8.0.10 (8.0.1024.46610), X64 RyuJIT AVX2
  Job-YVJTZC : .NET 8.0.10 (8.0.1024.46610), X64 RyuJIT AVX2
  Job-ZDJCYM : .NET 9.0.0 (9.0.24.47305), X64 RyuJIT AVX2

Server=True  InvocationCount=Default  IterationTime=350ms  
MaxIterationCount=15  MinIterationCount=5  WarmupCount=6  
Quotes=False  Reader=String

| Method    | Runtime  | Scope | Rows    | Mean       | Ratio | MB  | MB/s   | ns/row | Allocated   | Alloc Ratio |
|---------- |--------- |------ |-------- |-----------:|------:|----:|-------:|-------:|------------:|------------:|
| Sep______ | .NET 8.0 | Asset | 50000   |  21.402 ms |  1.00 |  29 | 1363.5 |  428.0 |  14133102 B |        1.00 |
| Sep_MT___ | .NET 8.0 | Asset | 50000   |   5.576 ms |  0.26 |  29 | 5233.7 |  111.5 |  14308501 B |        1.01 |
| Sep______ | .NET 9.0 | Asset | 50000   |  24.444 ms |  1.14 |  29 | 1193.8 |  488.9 |  14133077 B |        1.00 |
| Sep_MT___ | .NET 9.0 | Asset | 50000   |   8.965 ms |  0.42 |  29 | 3255.0 |  179.3 |  14310332 B |        1.01 |
| Sep______ | .NET 8.0 | Asset | 1000000 | 429.654 ms |  1.00 | 583 | 1358.7 |  429.7 | 273063216 B |        1.00 |
| Sep_MT___ | .NET 8.0 | Asset | 1000000 | 102.979 ms |  0.24 | 583 | 5668.9 |  103.0 | 274049328 B |        1.00 |
| Sep______ | .NET 9.0 | Asset | 1000000 | 500.250 ms |  1.16 | 583 | 1167.0 |  500.3 | 273062592 B |        1.00 |
| Sep_MT___ | .NET 9.0 | Asset | 1000000 | 174.802 ms |  0.41 | 583 | 3339.7 |  174.8 | 273973628 B |        1.00 |


EgorBo · 2024-10-19T12:29:59Z

Try with DATAS disabled, e.g. <GarbageCollectionAdaptationMode>0</GarbageCollectionAdaptationMode>

stephentoub · 2024-10-19T12:47:31Z

cc: @mangod9, @Maoni0

nietras · 2024-10-19T12:50:13Z

The command I run from branch net9.0:

dotnet run -c Release -f net8.0 --project src/Sep.ComparisonBenchmarks/Sep.ComparisonBenchmarks.csproj -- -m --warmupCount 6 --minIterationCount 5 --maxIterationCount 15 --runtimes net80 net90 --iterationTime 350 --hide Type Quotes Reader RatioSD Gen0 Gen1 Gen2 Error Median StdDev --filter *GcServerLongAsset*Sep*

No change with <GarbageCollectionAdaptationMode>0</GarbageCollectionAdaptationMode>, but I can't remember whether BDN actually forwards this to its sub-processes. Is there a flag to tell BDN to use this, like Server=True?

Server=True  InvocationCount=Default  IterationTime=350ms
MaxIterationCount=15  MinIterationCount=5  WarmupCount=6
Quotes=False  Reader=String

| Method    | Runtime  | Scope | Rows    | Mean     | Ratio | MB  | MB/s   | ns/row | Allocated | Alloc Ratio |
|---------- |--------- |------ |-------- |---------:|------:|----:|-------:|-------:|----------:|------------:|
| Sep______ | .NET 8.0 | Asset | 1000000 | 431.7 ms |  1.00 | 583 | 1352.1 |  431.7 | 260.41 MB |        1.00 |
| Sep_MT___ | .NET 8.0 | Asset | 1000000 | 111.1 ms |  0.26 | 583 | 5252.6 |  111.1 |  261.2 MB |        1.00 |
| Sep______ | .NET 9.0 | Asset | 1000000 | 500.7 ms |  1.16 | 583 | 1165.9 |  500.7 | 260.42 MB |        1.00 |
| Sep_MT___ | .NET 9.0 | Asset | 1000000 | 178.4 ms |  0.41 | 583 | 3272.0 |  178.4 | 261.32 MB |        1.00 |

nietras · 2024-10-19T13:27:49Z

Yes, it's DATAS. I tried setting it via an environment variable, e.g. for BDN with --envVars DOTNET_GCDynamicAdaptationMode:0, and ran with both 0 and 1, as can be seen below. This means the "regression" is solely due to DATAS being the default; otherwise there is no meaningful difference between the runtimes.

NO DATAS

dotnet run -c Release -f net8.0 --project src/Sep.ComparisonBenchmarks/Sep.ComparisonBenchmarks.csproj -- -m --warmupCount 6 --minIterationCount 5 --maxIterationCount 15 --runtimes net80 net90 --iterationTime 350 --hide Type Quotes Reader RatioSD Gen0 Gen1 Gen2 Error Median StdDev --filter *GcServerLongAsset*Sep* --envVars DOTNET_GCDynamicAdaptationMode:0

BenchmarkDotNet v0.14.0, Windows 10 (10.0.19044.3086/21H2/November2021Update)
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK 9.0.100-rc.2.24474.11
  [Host]     : .NET 8.0.10 (8.0.1024.46610), X64 RyuJIT AVX2
  Job-KKDGWQ : .NET 8.0.10 (8.0.1024.46610), X64 RyuJIT AVX2
  Job-HUTQEJ : .NET 9.0.0 (9.0.24.47305), X64 RyuJIT AVX2

EnvironmentVariables=DOTNET_GCDynamicAdaptationMode=0  Server=True  InvocationCount=Default
IterationTime=350ms  MaxIterationCount=15  MinIterationCount=5
WarmupCount=6  Quotes=False  Reader=String

| Method    | Runtime  | Scope | Rows    | Mean     | Ratio | MB  | MB/s   | ns/row | Allocated | Alloc Ratio |
|---------- |--------- |------ |-------- |---------:|------:|----:|-------:|-------:|----------:|------------:|
| Sep______ | .NET 8.0 | Asset | 1000000 | 452.7 ms |  1.00 | 583 | 1289.6 |  452.7 | 260.41 MB |        1.00 |
| Sep_MT___ | .NET 8.0 | Asset | 1000000 | 112.4 ms |  0.25 | 583 | 5195.4 |  112.4 | 261.51 MB |        1.00 |
| Sep______ | .NET 9.0 | Asset | 1000000 | 445.3 ms |  0.98 | 583 | 1310.9 |  445.3 | 260.41 MB |        1.00 |
| Sep_MT___ | .NET 9.0 | Asset | 1000000 | 117.8 ms |  0.26 | 583 | 4954.0 |  117.8 | 261.38 MB |        1.00 |

DATAS

dotnet run -c Release -f net8.0 --project src/Sep.ComparisonBenchmarks/Sep.ComparisonBenchmarks.csproj -- -m --warmupCount 6 --minIterationCount 5 --maxIterationCount 15 --runtimes net80 net90 --iterationTime 350 --hide Type Quotes Reader RatioSD Gen0 Gen1 Gen2 Error Median StdDev --filter *GcServerLongAsset*Sep* --envVars DOTNET_GCDynamicAdaptationMode:1

BenchmarkDotNet v0.14.0, Windows 10 (10.0.19044.3086/21H2/November2021Update)
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK 9.0.100-rc.2.24474.11
  [Host]     : .NET 8.0.10 (8.0.1024.46610), X64 RyuJIT AVX2
  Job-ZORNME : .NET 8.0.10 (8.0.1024.46610), X64 RyuJIT AVX2
  Job-BHTHZN : .NET 9.0.0 (9.0.24.47305), X64 RyuJIT AVX2

EnvironmentVariables=DOTNET_GCDynamicAdaptationMode=1  Server=True  InvocationCount=Default
IterationTime=350ms  MaxIterationCount=15  MinIterationCount=5
WarmupCount=6  Quotes=False  Reader=String

| Method    | Runtime  | Scope | Rows    | Mean     | Ratio | MB  | MB/s   | ns/row | Allocated | Alloc Ratio |
|---------- |--------- |------ |-------- |---------:|------:|----:|-------:|-------:|----------:|------------:|
| Sep______ | .NET 8.0 | Asset | 1000000 | 527.5 ms |  1.00 | 583 | 1106.6 |  527.5 | 260.41 MB |        1.00 |
| Sep_MT___ | .NET 8.0 | Asset | 1000000 | 170.0 ms |  0.32 | 583 | 3433.5 |  170.0 | 261.41 MB |        1.00 |
| Sep______ | .NET 9.0 | Asset | 1000000 | 528.2 ms |  1.00 | 583 | 1105.2 |  528.2 | 260.41 MB |        1.00 |
| Sep_MT___ | .NET 9.0 | Asset | 1000000 | 182.9 ms |  0.35 | 583 | 3192.2 |  182.9 | 261.17 MB |        1.00 |
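To separate the effect of DATAS from the effect of the runtime version, the two tables above can be combined as in the following minimal Python sketch. The means are copied from those tables; the helper names `datas_cost` and `runtime_delta` are made up for illustration.

```python
# Mean times in milliseconds from the two tables above, keyed by
# (method, runtime, DOTNET_GCDynamicAdaptationMode value).
MEANS_MS = {
    ("Sep______", ".NET 8.0", 0): 452.7,
    ("Sep_MT___", ".NET 8.0", 0): 112.4,
    ("Sep______", ".NET 9.0", 0): 445.3,
    ("Sep_MT___", ".NET 9.0", 0): 117.8,
    ("Sep______", ".NET 8.0", 1): 527.5,
    ("Sep_MT___", ".NET 8.0", 1): 170.0,
    ("Sep______", ".NET 9.0", 1): 528.2,
    ("Sep_MT___", ".NET 9.0", 1): 182.9,
}


def datas_cost(method: str, runtime: str) -> float:
    """Mean with DATAS on divided by mean with DATAS off, same runtime."""
    return MEANS_MS[(method, runtime, 1)] / MEANS_MS[(method, runtime, 0)]


def runtime_delta(method: str, mode: int) -> float:
    """.NET 9.0 mean divided by .NET 8.0 mean, same DATAS setting."""
    return MEANS_MS[(method, ".NET 9.0", mode)] / MEANS_MS[(method, ".NET 8.0", mode)]


if __name__ == "__main__":
    for method in ("Sep______", "Sep_MT___"):
        for runtime in (".NET 8.0", ".NET 9.0"):
            print(f"DATAS on/off  {method} {runtime}: {datas_cost(method, runtime):.2f}x")
        for mode in (0, 1):
            print(f"net9.0/net8.0 {method} mode={mode}: {runtime_delta(method, mode):.2f}x")
```

On these numbers, turning DATAS on costs roughly 1.17x to 1.19x single-threaded and roughly 1.51x to 1.55x multi-threaded on either runtime, while the .NET 9.0 to .NET 8.0 ratio at a fixed setting stays between about 0.98x and 1.08x. These are single runs, so the small runtime-to-runtime differences may well be noise.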

mangod9 · 2024-10-19T15:28:36Z

Yeah, a throughput regression for certain microbenchmark scenarios is expected with DATAS. I assume the benchmark shows improved working set utilization?

hez2010 · 2024-10-19T18:23:40Z

It is expected in .NET 9.

In general, DATAS should benefit real-world applications a lot, as it can substantially reduce the working set and also improve GC latency, though it comes with a minor throughput penalty.

In another similar issue (#101006) I did a binary-tree allocation benchmark and got the following benchmark result on .NET 9 rc2:

Considering the large improvements to latency and working set, I would take the minor throughput perf regression.

nietras added the tenet-performance Performance related issue label Oct 19, 2024

dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Oct 19, 2024

dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Oct 19, 2024

nietras changed the title ~~.NET 8.0.10 vs 9.0.0 RC2 GC Server Performance Regression in Sep (CSV Parser) Benchmark~~ .NET 8.0.10 vs 9.0.0 RC2 GC Server Performance Regression in Sep (CSV Parser) Benchmark (due to DATAS default) Oct 19, 2024

vcsjones added area-GC-coreclr and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Oct 19, 2024
