Description
dumpheap -stat on a 19 GB dump (248M objects) completes successfully in ~66 seconds, but produces no output at all during the heap walk phase. On a large dump this is indistinguishable from a hang -- the console shows nothing, there is no progress indication, and the user has no way to know whether the command is working or stuck.
This came up during investigation of a C# Dev Kit memory leak. We initially tried dotnet-dump analyze interactively, sent dumpheap -stat, and waited over 10 minutes with apparently zero CPU or disk activity before giving up. We then wrote a custom ClrMD script using DataTarget.LoadDump() and heap.EnumerateObjects() with periodic progress output -- that completed in ~83 seconds and successfully analyzed the dump. It was only later, by attaching dotnet-stack to the dotnet-dump process and running non-interactively with file redirection (dotnet-dump analyze <dump> -c "dumpheap -stat" > output.txt), that we confirmed dotnet-dump was actually working fine the whole time.
What happens today
- User runs dumpheap -stat on a large dump
- The heap walk takes 60+ seconds with no output at all
- Then 8,649 lines (1.46 MB) of results are written all at once
- In an interactive/piped terminal, the burst of output can also cause pipe buffer backpressure, further contributing to the appearance of a hang
Suggestion
Print periodic progress to stderr during heap enumeration, e.g.:
> dumpheap -stat
Scanning heap: 50,000,000 / ~248,000,000 objects (3.2 GB / 16.3 GB)...
Scanning heap: 100,000,000 / ~248,000,000 objects (6.5 GB / 16.3 GB)...
...
The total object count may not be known in advance, but total heap size is available from segment metadata and bytes scanned could serve as the progress metric. Writing to stderr avoids polluting stdout for piped/redirected scenarios.
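To make the suggestion concrete, here is a minimal, language-agnostic sketch in Python of throttled progress reporting keyed on bytes scanned. It is not how SOS or ClrMD is implemented: `with_progress`, `format_gb`, the `(address, size)` tuple shape, and the one-second interval are all hypothetical names and choices made for illustration. The only ideas taken from the text above are that total heap size is known up front, that bytes scanned is the progress metric, and that progress goes to stderr.

```python
import sys
import time
from typing import Callable, Iterable, Iterator, Optional, TextIO, Tuple

# Hypothetical object record: (address, size_in_bytes).
HeapObject = Tuple[int, int]


def format_gb(num_bytes: int) -> str:
    """Render a byte count as decimal gigabytes with one decimal place."""
    return f"{num_bytes / 1e9:.1f} GB"


def with_progress(
    objects: Iterable[HeapObject],
    total_bytes: int,
    interval_s: float = 1.0,
    out: Optional[TextIO] = None,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[HeapObject]:
    """Yield every object unchanged, printing progress at most once per interval.

    total_bytes is assumed to come from segment metadata before the walk starts.
    Progress is written to stderr by default so stdout stays clean for piping.
    """
    stream = out if out is not None else sys.stderr
    scanned_objects = 0
    scanned_bytes = 0
    last_report = clock()
    for obj in objects:
        scanned_objects += 1
        scanned_bytes += obj[1]
        now = clock()
        # Throttle on elapsed time so reporting cost stays negligible
        # regardless of how many objects the heap contains.
        if now - last_report >= interval_s:
            last_report = now
            print(
                f"Scanning heap: {scanned_objects:,} objects "
                f"({format_gb(scanned_bytes)} / {format_gb(total_bytes)})...",
                file=stream,
                flush=True,
            )
        yield obj
```

Throttling on elapsed time rather than on a fixed object count keeps the output rate steady on both small and very large dumps, and the explicit flush ensures each line appears immediately even when stderr is redirected.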
Environment
- dotnet-dump 9.0.661903
- 19.3 GB dump, .NET 11.0.26.10518, Server GC, 248M objects
- Windows, 64 GB RAM
Related issues
- dotnet-dump is incredibly slow in dumpheap scenario #1637 -- dumpheap was rewritten for performance (closed, fixed in SOS v7). The perf is now good; this issue is about the remaining UX gap.
- Single File module scan takes a very long time (symbol lookup) #3737 -- Similar "appears hung" UX problem caused by slow symbol loading with no progress feedback.