Performance regression with regions-based GC when LOH has many short-lived allocations #115411

Open
@Treit

Description

The following repo has a demonstration program which can be used to reproduce the issue:

https://github.com/Treit/DeserializationLatencyIssue

.NET 9 results (5 minute run)

--------------------------------------------------
Total deserializations: 134338.
Deserializations per second: 446.61/s
Total slow deserialization: 76.
Min deserialization time: 0.28 ms.
Avg deserialization time: 2.10 ms.
Max deserialization time: 2531.10 ms.
Total GC events: 2501
GC Start events: 1251
GC End events: 1250
Total system memory: 63.44 GB
Average memory usage: 53.42 GB (84.21%)
Peak memory usage: 63.29 GB (99.76%)
--------------------------------------------------

.NET 9 with Segments-based (clrgc.dll) GC results (5 minute run)

--------------------------------------------------
Total deserializations: 240808.
Deserializations per second: 801.77/s
Total slow deserialization: 20.
Min deserialization time: 0.28 ms.
Avg deserialization time: 1.22 ms.
Max deserialization time: 5723.17 ms.
Total GC events: 762
GC Start events: 381
GC End events: 381
Total system memory: 63.44 GB
Average memory usage: 29.24 GB (46.09%)
Peak memory usage: 61.6 GB (97.10%)
--------------------------------------------------
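To make the gap between the two runs easier to see, here is a small sketch that computes a few ratios from the numbers above. The figures are copied verbatim from the two result blocks; the dictionary and variable names are only illustrative and are not part of the repro program.

```python
# Figures copied from the two 5-minute runs reported above.
regions = {   # .NET 9 default (regions-based GC)
    "deserializations": 134338,
    "per_second": 446.61,
    "slow": 76,
    "gc_starts": 1251,
    "avg_mem_gb": 53.42,
}
segments = {  # .NET 9 with clrgc.dll (segments-based GC)
    "deserializations": 240808,
    "per_second": 801.77,
    "slow": 20,
    "gc_starts": 381,
    "avg_mem_gb": 29.24,
}

# How much more throughput the segments-based run achieved.
throughput_ratio = segments["per_second"] / regions["per_second"]

# How many more GCs the regions-based run started.
gc_ratio = regions["gc_starts"] / segments["gc_starts"]

# Fraction of deserializations flagged as slow in each run.
slow_rate_regions = regions["slow"] / regions["deserializations"]
slow_rate_segments = segments["slow"] / segments["deserializations"]

# Average memory usage, regions relative to segments.
mem_ratio = regions["avg_mem_gb"] / segments["avg_mem_gb"]

print(f"throughput (segments / regions): {throughput_ratio:.2f}x")
print(f"GC starts (regions / segments):  {gc_ratio:.2f}x")
print(f"slow rate regions:  {slow_rate_regions:.6f}")
print(f"slow rate segments: {slow_rate_segments:.6f}")
print(f"avg memory (regions / segments): {mem_ratio:.2f}x")
```

Note that the maximum deserialization time is higher in the segments-based run, so these ratios describe throughput, GC count, and memory rather than worst-case latency.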

Steps to repro (PowerShell):

.NET 9:

cd DeserializationStress
$env:DOTNET_GCName=""
dotnet run --configuration Release 300

.NET 9 with Segments-based (clrgc.dll) GC:

cd DeserializationStress
$env:DOTNET_GCName="clrgc.dll"
dotnet run --configuration Release 300

Additional background

The scenario in question is based on a real-world large-scale web service.

The web service requests result in a lot of Large Object Heap (LOH) allocations. About 50% of all allocations are strings, as the service does a lot of heavy JSON processing and produces large JSON responses to send back to the caller (sometimes megabytes in size).
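As a rough illustration of why these JSON strings end up on the LOH: by default, .NET places objects of 85,000 bytes or more on the LOH, and .NET strings are UTF-16 (2 bytes per character). The sketch below estimates the character count at which a string crosses that threshold. The per-string overhead constant is an approximation I am assuming for a 64-bit process, and the function names are hypothetical.

```python
LOH_THRESHOLD_BYTES = 85_000   # default .NET large-object threshold
BYTES_PER_CHAR = 2             # .NET strings are UTF-16
STRING_OVERHEAD_BYTES = 22     # ASSUMPTION: approximate per-string overhead on 64-bit

def estimated_string_bytes(char_count: int) -> int:
    """Approximate heap size of a .NET string with char_count characters."""
    return STRING_OVERHEAD_BYTES + BYTES_PER_CHAR * char_count

def lands_on_loh(char_count: int) -> bool:
    """True if a string of this length would be allocated on the LOH."""
    return estimated_string_bytes(char_count) >= LOH_THRESHOLD_BYTES

# A 10,000-character payload stays on the small object heap,
# while a 1,000,000-character JSON response is far past the threshold.
for chars in (10_000, 50_000, 1_000_000):
    print(chars, estimated_string_bytes(chars), lands_on_loh(chars))
```

Under this estimate, any JSON string longer than roughly 42,500 characters is an LOH allocation, so megabyte-sized responses always are.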

Issue after moving from .NET 6 to .NET 9.

The service showed a decrease in availability (measured by number of failed requests) due to requests timing out, shortly after moving to .NET 9. Requests are required to complete in a few seconds, so any introduced latency can cause timeouts to become more frequent, which was the case here.

In particular, JSON serialization and deserialization operations that should normally complete in tens of milliseconds at most were increasingly taking multiple seconds, causing the call to fail.

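The results of the stress test program above, which is intended to simulate this issue, show the difference between the old (segments-based) and new (regions-based) GC implementations: with the regions-based GC, throughput drops from 801.77 to 446.61 deserializations per second, slow deserializations rise from 20 to 76, and average memory usage rises from 29.24 GB to 53.42 GB.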
