perf: full PGO pipeline - SPGO, CallFrequency layout, hot-cold splitting, cross-module inlining #10877
Draft
benaadams wants to merge 276 commits into master from pgo-2
5a2b5db
fix: run dotnet-trace in foreground for clean trace finalization
benaadams 9d6d6f3
feat: use dotnet-trace collect-linux for real kernel CPU sampling
benaadams d603cb0
fix: use timed dotnet-trace collection with Nethermind as PID 1
benaadams 6ed47e6
fix: add DotNETRuntime provider to sampling trace for method resolution
benaadams 6a9e501
chore(pgo): update PGO profile
github-actions[bot] 59b95f0
fix: exclude sampling.nettrace from main trace selection
benaadams f1b6c44
fix: correct trace selection, reduce attach delay, extend sampling
benaadams 250c5c0
chore(pgo): update PGO profile
github-actions[bot] 3035994
feat: switch to collect-linux for kernel CPU sampling
benaadams d6bb089
chore(pgo): update PGO profile
github-actions[bot] 1c5884f
feat: start collect-linux before Nethermind for startup tracing
benaadams 7d7ec48
fix: build dotnet-pgo from main for collect-linux trace format
benaadams 08c509f
chore(pgo): update PGO profile
github-actions[bot] fe631db
feat: switch to perfcollect for CPU sampling
benaadams 2b8d241
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams bd297a7
chore(pgo): update PGO profile
github-actions[bot] f918cc6
fix: install perf only, use -nolttng with perfcollect
benaadams cbe5474
fix: install LTTng 2.13 and use perfcollect with full tracing
benaadams f2262d1
fix: simplify sampling to dotnet-trace with combined profiles
benaadams 0104f9f
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 64bf468
chore(pgo): update PGO profile
github-actions[bot] 460c1d2
fix(pgo): use .nettrace extension for sampling trace instead of .zip
benaadams 76df31e
chore(pgo): update PGO profile
github-actions[bot] 5c8b24b
fix(pgo): add MethodDetails provider for sampling trace
benaadams e18e7aa
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 895db2b
chore(pgo): update PGO profile
github-actions[bot] 75826bf
fix(pgo): always merge sampling .mibc when available
benaadams 098698e
chore(pgo): update PGO profile
github-actions[bot] 18e5316
fix(pgo): add --spgo flag to sampling trace conversion
benaadams e745f93
chore(pgo): update PGO profile
github-actions[bot] d6f44bc
fix(pgo): switch sampling to perfcollect on Ubuntu Focal for SPGO
benaadams 397c9bb
chore(pgo): update PGO profile
github-actions[bot] 380b9ba
fix(pgo): use python3 for safe YAML injection of --privileged flag
benaadams e4f5a57
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 5b27abd
fix(pgo): fix YAML-breaking python3 injection of --privileged flag
benaadams 92c840c
fix(pgo): switch sampling image back to Noble for glibc compatibility
benaadams 01fdcd0
fix(pgo): add zip package, network cleanup, perfcollect crash output
benaadams c50286f
fix(db): ensure Environment.Exit on DB corruption even if marker writ…
benaadams 9302302
fix(pgo): show all perfcollect and Nethermind output on stdout
benaadams a148493
fix(pgo): set sampling to single iteration to prevent crash loops
benaadams 8a28c9c
fix(pgo): set main PGO collection to single iteration too
benaadams c185c8d
fix(pgo): revert amount/warmup overrides — amount is payload count no…
benaadams 412ceb5
fix(pgo): remove --privileged injection from sampling config
benaadams 9411de6
fix(pgo): clean up stale containers/networks before sampling step
benaadams ab3a84d
chore(pgo): update PGO profile
github-actions[bot] db704a7
fix(pgo): add missing unzip package for perfcollect
benaadams ffbb940
fix(pgo): fix perfcollect output buffering and PID tracking
benaadams 45c6184
chore(pgo): update PGO profile
github-actions[bot] 2f75caf
fix(pgo): run perfcollect on host instead of inside Docker container
benaadams 09cd264
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams f669157
Revert "fix(pgo): run perfcollect on host instead of inside Docker co…
benaadams 672d2c2
fix(pgo): use EXPB security_opt branch and add seccomp=unconfined
benaadams dea5b2b
chore(pgo): update PGO profile
github-actions[bot] cc71a45
fix(pgo): set perf_event_paranoid=-1 on host before sampling
benaadams e8113d4
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 3fffa92
chore(pgo): update PGO profile
github-actions[bot] f05d124
fix(pgo): wait for perfcollect post-processing before container exit
benaadams 3cdeec6
fix(pgo): let perfcollect finish before container stop signal
benaadams cfabdd8
chore(pgo): update PGO profile
github-actions[bot] 0d0b1a9
fix(pgo): wait for LTTng session before starting Nethermind
benaadams 5aea5b7
chore(pgo): update PGO profile
github-actions[bot] e0bfc2e
feat(pgo): build libcoreclrtraceptprovider.so from source for LTTng
benaadams bbe1297
chore(pgo): simplify — remove unused cmake, deduplicate cleanup blocks
benaadams 4f00a04
chore(pgo): address review findings — signal trap, sparse checkout, A…
benaadams 35cd9a2
fix(pgo): add pal/inc to sparse checkout and include paths
benaadams 490387f
fix(pgo): drop sparse checkout, add minipal include path
benaadams 41ddcad
fix(pgo): use runtime build system for libcoreclrtraceptprovider.so
benaadams 2fe5c7a
fix(pgo): install full coreclr build prerequisites
benaadams 17dc8b5
fix(pgo): use stub PAL headers instead of building full coreclr
benaadams fedd8de
chore(pgo): update PGO profile
github-actions[bot] 2235fcd
fix(pgo): repack sampling.trace.zip to strip container path prefix
benaadams bf5ef5c
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 204d8ff
chore(pgo): update PGO profile
github-actions[bot] 2a56a2e
fix(pgo): build coreclr with LTTng — swap both libcoreclr.so and prov…
benaadams 4490c07
fix(pgo): use build output path directly instead of /tmp
benaadams 688e55e
fix(pgo): pin lttng-build stage to linux/amd64 — skip arm build
benaadams f8cd6a5
fix(pgo): build Docker images for linux/amd64 only — skip arm64
benaadams 6c7ab25
fix: add platforms input to publish-docker workflow
benaadams f37e532
Revert "fix: add platforms input to publish-docker workflow"
benaadams 7152533
fix(pgo): remove platforms input — publish-docker.yml doesn't accept it
benaadams 8160e24
fix(pgo): clear stale pgo-data before extraction — self-hosted runner…
benaadams abf03a5
chore(pgo): update PGO profile
github-actions[bot] ae06985
fix(pgo): set DOTNET_LTTngConfig for MethodDiagnostic keyword
benaadams 789c988
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 0340e38
chore(pgo): update PGO profile
github-actions[bot] e511856
fix(pgo): enable all DotNETRuntime LTTng tracepoints including Method…
benaadams 486e4bb
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 7b3f336
fix(pgo): patch perfcollect to add MethodDetails tracepoint
benaadams 64652f9
chore(pgo): update PGO profile
github-actions[bot] 42067aa
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 7adbdff
chore(pgo): update PGO profile
github-actions[bot] 8de45b3
fix(pgo): remove DOTNET_LTTngConfig — let runtime activate all keywords
benaadams 3bf82c8
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 69e0fc0
chore(pgo): update PGO profile
github-actions[bot] b8d009a
fix(pgo): build TraceEvent from source with MethodDetails CTF mapping
benaadams d567d8c
fix(pgo): build patched TraceEvent as version 3.1.28 to match dotnet-…
benaadams 1adf4db
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 7d74df5
fix(pgo): strip SupportFiles dependency from TraceEvent build
benaadams 9c94490
fix(pgo): correct CtfEventMapping args — (opcode, id, version) not (e…
benaadams 751a9d0
chore(pgo): update PGO profile
github-actions[bot] a5449c2
fix(pgo): use python regex to strip SupportFiles — sed broke XML stru…
benaadams 5154261
feat(pgo): add PgoTrim convert-trace to inject MethodDetails CTF mapping
benaadams d646204
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 6ab8a0c
chore(pgo): update PGO profile
github-actions[bot] faa324d
fix(pgo): enable TypeKeyword in perfcollect defaults for BulkType events
benaadams 7e78e43
fix(pgo): add TypeKeyword to perfcollect defaults, add debug output f…
benaadams bfa5b47
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 2b137b0
chore(pgo): update PGO profile
github-actions[bot] 0d341d3
fix(pgo): patch dotnet-pgo to use TraceLog() directly for .etlx input
benaadams d00a6dc
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 22a9187
fix(pgo): add AssemblyLoad to perfcollect's LoaderKeyword array
benaadams a2ad970
chore(pgo): update PGO profile
github-actions[bot] 79fb6b4
fix(pgo): upgrade TraceEvent to 3.1.30 — .etlx format version 74 vs 7…
benaadams e467800
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams fba6865
chore(pgo): update PGO profile
github-actions[bot] e15b6cb
fix(pgo): force rebuild PgoTrim to pick up TraceEvent 3.1.30 on cache…
benaadams f84ee64
fix(pgo): move PgoTrim convert-trace to its own step before dotnet-pg…
benaadams 7df050a
fix(pgo): consolidate all PgoTrim work into one step before dotnet-pgo
benaadams 41fae7e
fix(pgo): use dotnet restore --force + build --no-incremental for Pgo…
benaadams 55c34da
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 7e6f8c9
chore(pgo): update PGO profile
github-actions[bot] 41e9ecf
fix(pgo): clear NuGet caches and stale sources before PgoTrim restore
benaadams 1414f59
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 7cbf976
fix(pgo): log TraceEvent assembly version in PgoTrim for debugging
benaadams 57a4139
chore(pgo): update PGO profile
github-actions[bot] 0f39912
fix(pgo): add debug output for dotnet-pgo TraceEvent version and .etl…
benaadams 7ef543a
fix(pgo): override dotnet-pgo TraceEvent to 3.1.30 to match PgoTrim
benaadams b879a95
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams a139d9e
chore(pgo): update PGO profile
github-actions[bot] ddceeba
feat(pgo): build dotnet-pgo with PerfView PR branch for full SPGO sup…
benaadams a2858a1
fix(pgo): add Dia2Lib and TraceReloggerLib compile refs for PerfView …
benaadams a038c07
fix(pgo): remove --no-build from PgoTrim — no prior build step exists
benaadams d629b12
chore(pgo): update PGO profile
github-actions[bot] 0a6f84b
feat(pgo): add SPGO perf sample extraction and dotnet-pgo injection
benaadams a54e733
chore(pgo): update PGO profile
github-actions[bot] f349088
fix(pgo): add JittedMethodILToNativeMap keyword for SPGO block attrib…
benaadams f52db26
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 9461523
chore(pgo): update PGO profile
github-actions[bot] 853d6e3
fix(pgo): add CompilationDiagnostic keyword and ILToNativeMap diagnos…
benaadams 22c61a2
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 4b8c587
chore(pgo): update PGO profile
github-actions[bot] 9a95c6e
fix(pgo): enable MethodILToNativeMap_V1 — runtime fires V1, not V0
benaadams 0828e46
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 29d8c27
chore(pgo): update PGO profile
github-actions[bot] c117eef
fix(pgo): remove DOTNET_LTTngConfig — use ActivateAllKeywordsOfAllPro…
benaadams 229c5c0
debug(pgo): add CTF event dispatch counting and remove LTTngConfig fo…
benaadams 666ee6f
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 5a86c1f
fix(pgo): move debug print inside using block — variable scope error
benaadams 82e1bb2
fix(pgo): pass KeepAllEvents=true to preserve ILToNativeMap in .etlx
benaadams 3493901
Undo unrelated change
benaadams 8789a38
done
benaadams 9f0b13d
chore(pgo): update PGO profile
github-actions[bot] dfc55c4
fix(pgo): catch FlowSmoothing crash so SPGO doesn't lose all 143K sam…
benaadams aa72704
whitespace
benaadams 5d062f5
chore(pgo): update PGO profile
github-actions[bot] 541b897
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams e1fd5a4
debug: add PGO diagnostic output and PublishReadyToRunShowWarnings to…
benaadams e1b4f82
fix(pgo): increase perfcollect sampling duration from 120s to 240s fo…
benaadams b905c78
chore(pgo): update PGO profile
github-actions[bot] 11bff70
fix(pgo): revert to 120s sampling — 240s exceeded container lifetime
benaadams b2d6d32
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 359c976
fix(pgo): increase perfcollect to 150s — captures full block processi…
benaadams f016780
fix(pgo): lower --spgo-min-samples to 20 for broader method coverage
benaadams 7b0be72
fix(pgo): increase perf sampling frequency from 1000 Hz to 4000 Hz fo…
benaadams 17184b8
chore(pgo): update PGO profile
github-actions[bot] 861032e
chore(pgo): update PGO profile
github-actions[bot] 1d2528d
fix(pgo): revert to 120s/1000Hz — 4000Hz causes perfcollect to crash …
benaadams e6c56b9
fix(pgo): try 150s collection at 1000Hz — frequency was the crash cau…
benaadams 0565c4c
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 34353cd
fix(pgo): revert to 120s — 150s collection + 30s zip exceeds containe…
benaadams b3f0ef9
chore(pgo): update PGO profile
github-actions[bot] 643911c
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams 6837301
chore(pgo): update PGO profile
github-actions[bot] 7453b7f
debug: fix PGO diagnostic path and add verbose publish to grep for mi…
benaadams 089e9a6
Add the pgo files to output
benaadams 424240d
fix(pgo): increase TC_CallCountingDelayMs from 0 to 30ms
benaadams 12dca67
fix(evm): use representative values in opcode warmup for better PGO p…
benaadams 43b8861
fix(evm): skip state-touching opcodes during warmup to prevent GDV po…
benaadams 28cf12a
fix: improve real-time Nethermind log tailing in benchmark workflow
benaadams f409842
debug: fix Directory.Build.targets path and broaden crossgen2 grep pa…
benaadams d22bfcb
fix: don't swallow dotnet publish errors — tee to log file and check …
benaadams 888a577
fix(pgo): clean up old Docker images to free disk space
benaadams a3aa240
fix(pgo): also clean up old EXPB infra images (Alloy, k6) during disk…
benaadams 95d64b4
fix(pgo): keep 10G buildx cache to preserve coreclr lttng-build layers
benaadams 28afdf7
fix(pgo): prune all buildx cache — old coreclr builds are wasting disk
benaadams 208f1fb
fix(pgo): move Docker cleanup to end of job, keep start minimal
benaadams a71ef16
fix(pgo): kill ALL expb containers at start — Alloy/k6 hold network r…
benaadams 1a41916
fix(pgo): respect EXPB lock file — abort if another run is active (<1…
benaadams e787205
fix(pgo): prevent container restart loops with sentinel file
benaadams e1ff56a
TEMPORARY: force kill all containers + remove lock to recover from st…
benaadams b45aef7
fix(pgo): keep 4G buildx cache to avoid rebuilding coreclr every run
benaadams 207cca4
feat(pgo): pre-build LTTng coreclr image — skip 10min rebuild every run
benaadams 5cd2fcb
fix(pgo): remove if:always() from processing steps — fail fast on errors
benaadams 2a4c997
fix(pgo): unmount stale overlay mounts at startup — overlay EBUSY blo…
benaadams 1336513
fix(pgo): lazy unmount + rm overlay work dirs to fully clear stale mo…
benaadams d0f35b8
fix(pgo): add timeout to docker rm in sampling cleanup — prevents han…
benaadams 6ce7946
fix(pgo): do full disk cleanup at start of job — out of disk causes i…
benaadams 48f9edf
fix(pgo): remove stale Docker volumes before EXPB runs
benaadams c356dd3
fix(pgo): render sampling config from base — don't depend on RUNNER_TEMP
benaadams 8e3e27d
chore(pgo): update PGO profile
github-actions[bot] af6f19f
fix(pgo): improve SPGO sampling — delay Tier-1, 15K blocks, 400s coll…
benaadams b4341d7
fix(pgo): move TC_CallCountingDelayMs into sampling Dockerfile
benaadams 8777832
fix(pgo): drop payloads-15000 swap — file may not exist on runner
benaadams 2de827f
fix(pgo): use extra_env not environment, send 10K blocks for sampling
benaadams 28de12a
fix(pgo): reduce COLLECTSEC to 300 — must finish before container stops
benaadams f65a4b7
chore(pgo): update PGO profile
github-actions[bot] 161a377
fix(pgo): increase sampling window — TC_delay=15min, collect=12min
benaadams a559735
fix(pgo): match main collection to sampling — TC_delay=15min, 10K blo…
benaadams 6d860f1
fix(pgo): remove unused COLLECTSEC from main collection, fix comments
benaadams da5a7bd
fix(pgo): remove TC delay from main collection — Tier-1 gives richer …
benaadams b7233b5
fix(pgo): reduce COLLECTSEC to 600 — 720 outlasted container lifetime
benaadams 35a0127
chore(pgo): update PGO profile
github-actions[bot] dbf1576
fix(pgo): reduce COLLECTSEC to 580 — more margin before container stops
benaadams cb643b8
Merge branch 'master' into pgo-2
benaadams 2dced5e
chore(pgo): remove debug scaffolding from Dockerfile and TraceConverter
benaadams 435c128
Merge branch 'pgo-2' of https://github.com/NethermindEth/nethermind i…
benaadams f5b641b
refactor(pgo): move PGO Dockerfiles into tools/PgoTrim
benaadams 57a4f0b
refactor(pgo): respect EXPB lock, clean only own resources
benaadams 661879e
fix(pgo): always exit if EXPB lock exists — use cleanup workflow to c…
benaadams ef66fb9
fix(pgo): remove unnecessary if:always() from upload and extract steps
benaadams 1113c64
fix(pgo): remove continue-on-error from EXPB step — fail fast if coll…
benaadams 17efce1
fix(pgo): fail update if .jit.gz artifact is missing
benaadams 03721e6
chore(pgo): reduce artifact retention — profiles are committed to repo
benaadams 158fa98
fix(pgo): fail if CPU sampling pass fails — SPGO is not optional
benaadams 3b301ad
chore(pgo): update PGO profile
github-actions[bot] f3c2c6d
Merge branch 'master' into pgo-2
benaadams 23080db
feat(pgo): add call graph extraction for Pettis-Hansen method layout
benaadams 93c1aa4
style(pgo): replace var with concrete types in PgoTrim
benaadams be7f008
fix(pgo): exclude dotnet-pgo-patches from PgoTrim build
benaadams 272b2f4
fix(pgo): SampleCorrelator is in root namespace, not SPGO sub-namespace
benaadams 5dece21
fix(pgo): nullable compat and remove out-of-scope params from LoadCal…
benaadams 868ef61
fix(pgo): use nullable annotations correctly for dotnet-pgo warnings-…
benaadams 1d37afb
fix(runner): time-box DB disposal to 15s during shutdown (TEMPORARY)
benaadams 3488207
chore(pgo): update PGO profile
github-actions[bot] 7906e26
Revert "fix(runner): time-box DB disposal to 15s during shutdown (TEM…
benaadams 9c21c85
revert(pgo): remove TC_CallCountingDelayMs=30 to isolate benchmark re…
benaadams 2af4ad9
feat(pgo): generate CallChainProfile for crossgen2 CallFrequency layout
benaadams 0f62b74
feat(pgo): enable hot-cold splitting in R2R using SPGO block counts
benaadams d5fbe09
chore(pgo): compress callchain JSON (1.9MB -> 222KB) and decompress a…
benaadams 587e3c5
fix(pgo): add targeted R2R verbose output to verify callchain/hot-col…
benaadams d5a6372
fix(pgo): use diagnostic verbosity to verify crossgen2 flags (TEMPORARY)
benaadams 408a56f
fix(pgo): decompress callchain JSON before dotnet publish, not in MSB…
benaadams 33590fc
fix: remove diagnostic verbosity from Dockerfile - was causing CI tim…
benaadams 5215830
fix(pgo): decompress callchain JSON before R2R publish in build-solut…
benaadams 321e2c3
fix(pgo): decompress callchain JSON before benchmark builds for R2R t…
benaadams 25b2d62
refactor(pgo): move R2R Composite + OptimizationPreference to Directo…
benaadams babcc89
fix(pgo): disable R2R for test projects - composite R2R was causing C…
benaadams 034afba
Merge branch 'master' into pgo-2
benaadams a68ed8a
Merge branch 'master' into pgo-2
benaadams 2ffe6b5
chore(pgo): update PGO profile
github-actions[bot]