Performance diagnosis handoff (panorama/profile lag)
This issue captures the completed diagnosis work so implementation can continue without re-running all profiling from scratch.
Current context
- Staging includes recent performance instrumentation and mitigation work.
- Workerized coverage.compute path has been added with fallback.
- Pan interaction guards + overscan/fast-full staging were implemented; a pan-trigger regression was found and fixed.
What was diagnosed
1) Lag is not primarily in panorama input handlers
From perf debug logs (debug-logs/Console.txt, panning*.txt, sliders*.txt):
panorama.handler.* and mapview.panoramaInteraction.* are typically very small (mostly sub-ms/low-ms).
panorama.effect.interactionDispatch is also small relative to total stalls.
Conclusion: pointer/hover handlers are not the core bottleneck.
2) Main cost is simulation + overlay pipeline
Repeatedly observed high costs in:
coverage.compute (simulation grid compute)
overlay.coverage.build.* / overlay.coverage.total.* (raster build/encode)
Examples seen across runs:
- panning cases had coverageComputeMs spikes into multi-second range (e.g. ~4-5s in worst captures).
- overlay build totals for passfail/relay can also be very large in some captures.
Conclusion: expensive compute/raster work dominates perceived lag.
3) Some captures were polluted by startup/background triggers
Even after attempting settle-first tests, many traces still include non-interaction triggers:
coverage.trigger.preset
coverage.trigger.terrain
coverage.trigger.selection
coverage.trigger.library-sync
Conclusion: not all collected traces were pure profile-only sessions; some include ongoing background/global recomputes.
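A capture could be screened for this kind of pollution before it is analysed. The following is a minimal sketch, assuming each captured log line contains the telemetry tag as a plain substring (the actual log format is not specified here); the function names are hypothetical.

```typescript
// Background/global trigger tags listed above.
const BACKGROUND_TRIGGERS = [
  "coverage.trigger.preset",
  "coverage.trigger.terrain",
  "coverage.trigger.selection",
  "coverage.trigger.library-sync",
] as const;

// Assumption: a log line mentions the tag verbatim somewhere in its text.
// Returns how many lines mention each background trigger.
function findBackgroundTriggers(logLines: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of logLines) {
    for (const tag of BACKGROUND_TRIGGERS) {
      if (line.includes(tag)) {
        counts.set(tag, (counts.get(tag) ?? 0) + 1);
      }
    }
  }
  return counts;
}

// A capture counts as profile-only when no background trigger appears in it.
function isProfileOnlyCapture(logLines: string[]): boolean {
  return findBackgroundTriggers(logLines).size === 0;
}
```

Captures that fail this check can still be useful, but their timings should not be attributed to profile interaction alone.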
4) Trigger attribution quality is now good
coverage.trigger.unknown was reduced to 0 in clean-tag captures.
- This means trigger provenance is now actionable.
5) Regression that was introduced and fixed
A pan-safezone implementation briefly caused recompute storms:
- false isMapInteracting due to non-user move events
- excessive pan-fast triggers
- stuck sidebar state (Preparing simulation bounds...) when run skipped as unchanged
Fixes applied:
- only user-driven moves set map-interacting
- throttled pan-fast triggering
- pan-settle only after real interaction
- coverage store clears pending UI state on skip-same-signature path
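Two of these fixes can be illustrated in isolation. This is a sketch of the general techniques, not the code that was merged: createThrottle, requestRun, and CoverageUiState are hypothetical names, and the throttle interval is whatever the caller passes in.

```typescript
// Leading-edge throttle: fires at most once per minIntervalMs and drops the rest.
// The clock is injectable so the behaviour can be checked without real timers.
function createThrottle(minIntervalMs: number, now: () => number = Date.now) {
  let lastFiredAt = Number.NEGATIVE_INFINITY;
  return (fire: () => void): boolean => {
    const t = now();
    if (t - lastFiredAt < minIntervalMs) {
      return false; // dropped: too soon after the previous trigger
    }
    lastFiredAt = t;
    fire();
    return true;
  };
}

// Hypothetical slice of the coverage store's UI-facing state.
interface CoverageUiState {
  pendingMessage: string | null;
  lastSignature: string | null;
}

// Skip-same-signature path: when inputs are unchanged the run is skipped,
// and the pending message must be cleared here, because no run will
// complete later to clear it.
function requestRun(state: CoverageUiState, signature: string, run: () => void): boolean {
  if (signature === state.lastSignature) {
    state.pendingMessage = null;
    return false;
  }
  state.lastSignature = signature;
  run(); // the run itself is assumed to clear pendingMessage on completion
  return true;
}
```

The important detail in the second function is that the skip branch owns the UI cleanup; without it the sidebar keeps showing the pending message indefinitely.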
Where problems are
- Global simulation path (coverage.compute) still contributes significant latency under interaction.
- Coverage overlay raster builds (overlay.coverage.build/total) still heavy in passfail/relay views.
- Profile smoothness is degraded when global recompute work overlaps profile interactions.
Where problems are not
- Not primarily in profile pointermove/hover handlers.
- Not primarily in panorama event dispatch plumbing.
Implemented groundwork already in place
- Perf telemetry buckets for triggers/stages/drops.
- Overscan/safezone and fast/full stage infrastructure in map overlay path.
- Worker infrastructure for coverage compute (coverageWorker + client integration + fallback).
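For the worker-plus-fallback arrangement, it helps to record which path actually ran. The sketch below shows one possible shape for that, assuming the worker call is exposed as a promise-returning function; computeWithFallback, computePathCounts, and ComputePath are hypothetical names, not the existing client API.

```typescript
type ComputePath = "worker" | "fallback";

// Running totals of which execution path served each compute request.
const computePathCounts: Record<ComputePath, number> = { worker: 0, fallback: 0 };

// Tries the worker first; if no worker is available or the worker call
// rejects, runs the main-thread fallback. Reports which path produced the result.
async function computeWithFallback<T>(
  viaWorker: (() => Promise<T>) | null,
  onMainThread: () => T,
): Promise<{ result: T; path: ComputePath }> {
  if (viaWorker !== null) {
    try {
      const result = await viaWorker();
      computePathCounts.worker += 1;
      return { result, path: "worker" };
    } catch {
      // Worker failed; fall through to the main-thread path below.
    }
  }
  const result = onMainThread();
  computePathCounts.fallback += 1;
  return { result, path: "fallback" };
}
```

A non-zero fallback count in staging would indicate that coverage.compute is still running on the main thread for at least some requests.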
Recommended next implementation steps (in order)
- Enforce strict profile-only interaction boundary:
  - while profile dragging/panning is active, suppress/defer non-essential global coverage recomputes.
  - settle-trigger authoritative recompute once interaction stops.
- Validate worker path is active in staging under real user flow:
  - verify coverage.compute no longer blocks main thread responsiveness.
  - add explicit telemetry flag/counter for worker vs fallback execution.
- Reduce overlay build pressure in interaction mode:
  - maintain fast-stage rasterization during active interaction.
  - ensure passfail/relay heavy paths are not recomputed more often than needed.
- Re-profile with strict protocol:
  - wait fully idle
  - clear log
  - perform profile-only interaction
  - avoid map/selection/env changes during capture
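The first step (defer global recomputes during profile interaction, then run one authoritative recompute on settle) could be structured roughly as follows. This is a sketch under assumptions: InteractionGate and its method names are hypothetical, and deciding which recomputes count as non-essential is left to the caller.

```typescript
type RecomputeReason = string;

// Collects recompute requests while an interaction is active and replays
// them as a single batched recompute when the interaction ends.
class InteractionGate {
  private interacting = false;
  private deferred = new Set<RecomputeReason>();
  private readonly recompute: (reasons: RecomputeReason[]) => void;

  constructor(recompute: (reasons: RecomputeReason[]) => void) {
    this.recompute = recompute;
  }

  beginInteraction(): void {
    this.interacting = true;
  }

  // Outside an interaction the recompute runs immediately; during one it is
  // only recorded, and duplicate reasons collapse into a single entry.
  request(reason: RecomputeReason): void {
    if (this.interacting) {
      this.deferred.add(reason);
      return;
    }
    this.recompute([reason]);
  }

  // Settle: run one recompute covering everything deferred, if anything was.
  endInteraction(): void {
    this.interacting = false;
    if (this.deferred.size === 0) {
      return;
    }
    const reasons = Array.from(this.deferred);
    this.deferred.clear();
    this.recompute(reasons);
  }
}
```

Batching the deferred reasons also keeps trigger provenance intact, so the settle recompute can still be attributed in telemetry.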
Acceptance target for this issue
- Profile panning/slider interactions feel smooth after initial load settles.
- No repeated storm-style recompute triggering during idle/non-map interaction.
- Telemetry clearly shows reduced overlap between profile interaction and heavy global recompute work.