@sy-nico sy-nico commented Dec 24, 2025

Issue Pull Request

Linear Issue

SY-3420

Description

This is part 1 of 2 of a larger effort to test and capture console profiling metrics. This set of changes provides the infrastructure on which we can build automated testing. Part 2 will consist of the automated macros and integration tests. So why build a whole dashboard and profiler for the console?

Current Problems

  • Playwright runs in the browser and has no access to CPU, GPU, or heap metrics
  • Getting those metrics would require building out console test infrastructure around devTools
    • We would also need to build out reporting mechanisms
  • There is no known way to automate actions on the console desktop app
    • Even though the desktop app can access those metrics, there is no way to open devTools

Benefits:

Profiling

  • Real-time visibility into CPU, GPU, FPS, and heap usage without external tools
  • Automatic threshold-based issue detection with severity labels (nominal/warning/error)
  • Profiling data persisted to Synnax ranges for historical analysis and trend tracking
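
To illustrate the threshold-based detection described above, here is a minimal sketch. The names `Thresholds`, `classify`, and `classifyInverted`, along with the numeric limits, are assumptions for illustration only; the actual analyzers and their thresholds live in the PR and may differ.

```typescript
// Hypothetical sketch: names and threshold values are assumptions, not the PR's code.
type Severity = "nominal" | "warning" | "error";

interface Thresholds {
  warning: number;
  error: number;
}

// For metrics where higher is worse (for example CPU or GPU percent).
function classify(value: number, t: Thresholds): Severity {
  if (value >= t.error) return "error";
  if (value >= t.warning) return "warning";
  return "nominal";
}

// For metrics where lower is worse (for example FPS).
function classifyInverted(value: number, t: Thresholds): Severity {
  if (value <= t.error) return "error";
  if (value <= t.warning) return "warning";
  return "nominal";
}

// Assumed example limits, chosen only to make the sketch concrete.
const cpuThresholds: Thresholds = { warning: 70, error: 90 };
const fpsThresholds: Thresholds = { warning: 30, error: 15 };
```

Keeping the two directions as separate functions avoids a sign flag that is easy to get backwards when a new metric is added.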

Testing

  • Macros enable automated, repeatable UI interactions for stress testing specific components
  • Significantly faster feedback loop than Playwright integration tests
  • Can integrate with Playwright tests for hybrid coverage (trigger profiling at test start)
  • Configurable iterations and delays for long-duration soak testing
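
As a rough picture of what "configurable iterations and delays" could look like, the sketch below loops over a list of steps and records timing per step. `MacroStep`, `MacroConfig`, `MacroResult`, and `runMacros` are hypothetical names; the real runner in `console/src/perf/macros/runner.ts` may be structured differently.

```typescript
// Hypothetical sketch: these types and this function are assumptions, not the PR's API.
interface MacroStep {
  name: string;
  run: () => Promise<void> | void;
}

interface MacroConfig {
  iterations: number; // how many times to repeat the full step list
  delayMs: number; // pause after each step
}

interface MacroResult {
  name: string;
  iteration: number;
  durationMs: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runMacros(
  steps: MacroStep[],
  config: MacroConfig,
): Promise<MacroResult[]> {
  const results: MacroResult[] = [];
  for (let i = 0; i < config.iterations; i++) {
    for (const step of steps) {
      const start = Date.now();
      await step.run();
      results.push({ name: step.name, iteration: i, durationMs: Date.now() - start });
      if (config.delayMs > 0) await sleep(config.delayMs);
    }
  }
  return results;
}
```

For a soak test, the same step list can be reused with a large `iterations` value and a non-zero `delayMs`.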

Customer debugging tools

  • Customers can run profiling sessions and share the saved range with support (us)
    • Useful for dry runs before a big event
  • Range metadata captures system context (hostname, platform, OS version, console version, username)
    • Admin or chief engineer would have easy access to control room console information without a screen share

Architecture

┌───────────────────────────────────────────────────────────────────┐
│                          Dashboard.tsx                            │
│   (Main UI: Start/Stop/Pause, MetricSections, MacroPanel)         │
└───────────────────────────────────────────────────────────────────┘
                                 │
          ┌──────────────────────┼──────────────────────┐
          ▼                      ▼                      ▼
┌───────────────────┐  ┌────────────────────┐  ┌───────────────────┐
│    Collectors     │  │     Analyzers      │  │ useMacroExecution  │
│  CPU, GPU, FPS,   │  │  FPS, CPU, GPU,    │  │ (Executes scripted │
│  Heap, Network,   │  │  Heap (leak)       │  │  UI interactions)  │
│  LongTasks, Logs  │  │                    │  │     (dev only)     │
└───────────────────┘  └────────────────────┘  └───────────────────┘
          │                      │                      │
          └──────────┬───────────┘                      │
                     ▼                                  │
          ┌───────────────────┐                         │
          │  Report Compiler  │                         │
          │ (Verdict, Issues, │                         │
          │  MetricsReport)   │                         │
          └───────────────────┘                         │
                     │                                  │
                     ▼                                  ▼
┌───────────────────────────────────────────────────────────────────┐
│                     Redux Slice (perf/slice.ts)                   │
│   status, config, reports, macroResults, rangeKey                 │
└───────────────────────────────────────────────────────────────────┘
profiler.mp4
dashboard.mp4
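
For reviewers who want a concrete picture of the slice at the bottom of the diagram, here is a minimal sketch of the status lifecycle as a plain reducer. It assumes the idle → running → paused → idle lifecycle and only models `status` and `rangeKey`; the `resume` action and the exact action shapes are assumptions, and the real slice in `perf/slice.ts` may differ.

```typescript
// Hypothetical sketch: a plain reducer, not the PR's actual slice.
type Status = "idle" | "running" | "paused";

type Action =
  | { type: "start" }
  | { type: "pause" }
  | { type: "resume" } // assumed action, not confirmed by the PR
  | { type: "reset" }
  | { type: "setRangeKey"; key: string };

interface PerfState {
  status: Status;
  rangeKey: string | null;
}

const initialState: PerfState = { status: "idle", rangeKey: null };

function perfReducer(state: PerfState, action: Action): PerfState {
  switch (action.type) {
    case "start":
      // Only an idle session can be started.
      return state.status === "idle" ? { ...state, status: "running" } : state;
    case "pause":
      return state.status === "running" ? { ...state, status: "paused" } : state;
    case "resume":
      return state.status === "paused" ? { ...state, status: "running" } : state;
    case "setRangeKey":
      return { ...state, rangeKey: action.key };
    case "reset":
      // Reset returns to idle from any state and drops the range key.
      return initialState;
  }
}
```

Invalid transitions return the state unchanged, so a stray `pause` while idle is a no-op rather than an error.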

Basic Readiness

  • I have performed a self-review of my code.
  • I have added relevant tests to cover the changes to CI.

Greptile Summary

Adds comprehensive performance profiling infrastructure to the console application, enabling real-time CPU, GPU, FPS, and heap monitoring with automated issue detection and Synnax range persistence.

Architecture:

  • Metrics Collection: Polling-based collectors for CPU/GPU (via Tauri), FPS (via requestAnimationFrame), heap (via performance.memory), network requests (Resource Timing API), long tasks (Long Task API), and console logs (intercepted)
  • Analysis Pipeline: Dedicated analyzers for each metric with threshold-based severity detection (warning/error) - FPS analyzer tracks drops, heap analyzer uses linear regression for leak detection, resource analyzers check peak and average usage
  • State Management: Redux slice with proper lifecycle (idle → running → paused → idle), batched report updates to minimize re-renders, transient data excluded from persistence
  • Range Integration: Creates Synnax ranges on profiling start with system metadata (hostname, OS, version), writes metrics every 5s, applies severity-based labels in real-time with latching behavior (peak-triggered labels are permanent, avg-triggered are transient)
  • Macro System: Automated UI interaction framework for stress testing (dev-only), includes lineplot and schematic macros with configurable iterations and delays
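
The linear-regression leak check mentioned above can be sketched as a least-squares slope over evenly spaced heap samples. `slope` and `detectLeak` are hypothetical names and the growth limit is an assumed parameter; the real heap analyzer also compares baseline against recent samples, which this sketch leaves out.

```typescript
// Hypothetical sketch: a least-squares slope over evenly spaced samples.
// The PR's heap analyzer may use a different formulation and thresholds.
function slope(samples: number[]): number {
  const n = samples.length;
  if (n < 2) return 0;
  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((acc, v) => acc + v, 0) / n;
  let num = 0;
  let den = 0;
  samples.forEach((y, x) => {
    num += (x - meanX) * (y - meanY);
    den += (x - meanX) * (x - meanX);
  });
  return num / den;
}

interface LeakReport {
  slopePerSample: number; // growth in heap units per sample interval
  leaking: boolean;
}

// maxSlope is an assumed tuning parameter, in the same units as the samples.
function detectLeak(heapSamples: number[], maxSlope: number): LeakReport {
  const slopePerSample = slope(heapSamples);
  return { slopePerSample, leaking: slopePerSample > maxSlope };
}
```

A slope-based check tolerates the sawtooth pattern of normal garbage collection better than comparing two individual samples would.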

Key Design Patterns:

  • Circular dependency resolution via ref + stable wrapper pattern in Dashboard.tsx
  • O(1) memory running aggregates with warmup period handling in SampleBuffer
  • AbortController for cleanup to prevent memory leaks on unmount
  • Label replacement strategy where error supersedes warning
  • Event correlation window for associating long tasks with user interactions
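
To make the running-aggregate pattern concrete, here is a minimal sketch of a fixed-size ring buffer that also tracks session-wide aggregates in constant memory. The class shape, method names, and the rule that the minimum skips the first `warmup` samples are assumptions; the real `SampleBuffer` in `metrics/buffer.ts` may differ.

```typescript
// Hypothetical sketch: not the PR's SampleBuffer, just the general idea.
class SampleBuffer {
  private readonly ring: number[];
  private readonly capacity: number;
  private readonly warmup: number;
  private head = 0; // index of the next write
  private size = 0; // number of valid entries in the ring
  private count = 0; // total samples ever pushed
  private sum = 0;
  private maxSeen = -Infinity;
  private minSeen = Infinity;

  constructor(capacity: number, warmup: number) {
    this.capacity = capacity;
    this.warmup = warmup;
    this.ring = new Array<number>(capacity).fill(0);
  }

  push(value: number): void {
    this.ring[this.head] = value;
    this.head = (this.head + 1) % this.capacity;
    this.size = Math.min(this.size + 1, this.capacity);
    this.count += 1;
    this.sum += value;
    this.maxSeen = Math.max(this.maxSeen, value);
    // Skip warmup samples for the minimum so startup dips do not skew it.
    if (this.count > this.warmup) this.minSeen = Math.min(this.minSeen, value);
  }

  average(): number {
    return this.count === 0 ? 0 : this.sum / this.count;
  }

  max(): number {
    return this.maxSeen;
  }

  min(): number {
    return this.minSeen;
  }

  // Most recent samples, oldest first.
  recent(): number[] {
    const out: number[] = [];
    const start = (this.head - this.size + this.capacity) % this.capacity;
    for (let i = 0; i < this.size; i++) {
      const v = this.ring[(start + i) % this.capacity];
      if (v !== undefined) out.push(v);
    }
    return out;
  }
}
```

The ring keeps only the last `capacity` samples for windowed analysis, while the aggregates cover the whole session without growing with its length.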

Testing:
13 test files with comprehensive coverage including unit tests for analyzers, buffer, report compiler, selectors, slice reducers, and macro system

Benefits:

  • Real-time visibility without external devTools
  • Historical profiling data persisted to Synnax for trend analysis
  • Customer debugging tool (customers can share profiling ranges)
  • Faster feedback than Playwright tests for performance validation
  • Automatic threshold-based issue detection with severity labels

Confidence Score: 5/5

  • This PR is safe to merge with high confidence - well-architected infrastructure with comprehensive testing
  • Score reflects excellent code quality: clean separation of concerns, sophisticated patterns (circular dependency resolution, latching behavior), comprehensive test coverage (13 test files), proper cleanup (AbortController, console restoration), optimized performance (batched updates, O(1) aggregates), and thorough documentation. The feature is isolated (dev-only macros, transient state excluded from persistence), cross-platform (Rust implementations for CPU/GPU), and production-ready.
  • No files require special attention - the codebase demonstrates mature engineering practices throughout

Important Files Changed

| Filename | Overview |
| --- | --- |
| console/src/perf/slice.ts | Redux slice for performance profiling state management - clean action creators with proper state transitions |
| console/src/perf/Dashboard.tsx | Main dashboard component with clever circular dependency resolution using ref pattern for hooks |
| console/src/perf/hooks/useProfilingSession.ts | Orchestrates profiling lifecycle with real-time analysis and label management, including latching behavior for peak metrics |
| console/src/perf/hooks/useProfilingRange.ts | Manages Synnax range CRUD and metadata writes with proper error handling and AbortController for cleanup |
| console/src/perf/metrics/buffer.ts | Efficient circular buffer with O(1) running aggregates and warmup period handling for FPS min calculation |
| console/src/perf/hooks/useCollectors.ts | Manages collector lifecycle and setInterval loop, batches state updates to minimize re-renders |
| console/src/perf/analyzer/heap-analyzer.ts | Linear regression-based memory leak detection comparing baseline vs recent samples with trend analysis |
| console/src-tauri/src/perf.rs | Cross-platform system metrics collection via Tauri commands - CPU/memory via sysinfo, GPU via NVML/IOKit |
| console/src/perf/macros/runner.ts | Executes macros in configurable loops with iteration control and timing metrics for automated UI interactions |
| console/src/perf/metrics/console.ts | Console message interceptor with extensible design for future hybrid mode (count all, store warn/error only) |
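
The latching behavior noted for `useProfilingSession.ts` can be sketched as follows, assuming that peak-triggered severities are permanent, average-triggered severities are transient, and error supersedes warning. `MetricLabel` and `worse` are hypothetical names, not identifiers from the PR.

```typescript
// Hypothetical sketch of latching labels; the PR's implementation may differ.
type Severity = "nominal" | "warning" | "error";

const rank: Record<Severity, number> = { nominal: 0, warning: 1, error: 2 };

// Returns the more severe of two severities, so error supersedes warning.
const worse = (a: Severity, b: Severity): Severity =>
  rank[a] >= rank[b] ? a : b;

class MetricLabel {
  private latched: Severity = "nominal";

  // peak: severity derived from the peak value, latched once raised.
  // avg: severity derived from the running average, re-evaluated every call.
  update(peak: Severity, avg: Severity): Severity {
    this.latched = worse(this.latched, peak);
    return worse(this.latched, avg);
  }
}
```

With this shape, a single spike leaves a permanent mark on the range, while a high average only shows for as long as it persists.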

Sequence Diagram

sequenceDiagram
    participant User
    participant Dashboard
    participant useCollectors
    participant useProfilingSession
    participant useProfilingRange
    participant Analyzers
    participant Redux
    participant Synnax

    User->>Dashboard: Click Start
    Dashboard->>Redux: dispatch(Perf.start())
    Redux-->>Dashboard: status="running"
    
    Dashboard->>useCollectors: Initialize collectors
    useCollectors->>useCollectors: Create CPU/GPU/FPS/Heap/Network/LongTask/Console collectors
    useCollectors->>useCollectors: Start setInterval(1000ms)
    
    Dashboard->>useProfilingSession: Setup session
    useProfilingSession->>useProfilingRange: createRange(startValues)
    useProfilingRange->>Synnax: Create range with metadata
    Synnax-->>useProfilingRange: rangeKey
    useProfilingRange->>Redux: setRangeKey()
    
    loop Every 1 second (while running)
        useCollectors->>useCollectors: collectSample()
        useCollectors->>useCollectors: Push to SampleBuffer
        useCollectors->>useProfilingSession: handleSample(sample, buffer)
        useProfilingSession->>Analyzers: analyze(samples, aggregates)
        Analyzers-->>useProfilingSession: {leak, fps, cpu, gpu} reports
        useProfilingSession->>Redux: setReports({leak, fps, cpu, gpu})
        useProfilingSession->>useProfilingRange: addMetricLabel() [if threshold exceeded]
        useProfilingRange->>Synnax: Add warning/error label to range
    end
    
    loop Every 5 seconds (while running)
        useProfilingRange->>useProfilingRange: writeMetrics()
        useProfilingRange->>Synnax: Update range metadata (averages, peaks)
    end
    
    User->>Dashboard: Click Pause
    Dashboard->>Redux: dispatch(Perf.pause())
    Redux-->>Dashboard: status="paused"
    useProfilingSession->>useProfilingSession: captureFinal(lastSample)
    useProfilingSession->>useProfilingRange: updateEndTime()
    useProfilingRange->>Synnax: Update range timeRange
    
    User->>Dashboard: Click Reset
    Dashboard->>Redux: dispatch(Perf.reset())
    Redux-->>Dashboard: status="idle"
    useProfilingSession->>Analyzers: Final analysis
    Analyzers-->>useProfilingSession: Final severities
    useProfilingSession->>useProfilingRange: finalizeRange(severities, stopValues)
    useProfilingRange->>Synnax: Add final labels + metadata
    useProfilingSession->>useProfilingSession: Cleanup (reset buffers, collectors)

…nd tests. Current dashboard is simply a draft.
…ysis code

- Use pre-allocated SampleBuffer with fixed 30-sample window comparison
- Handle Long Tasks API unavailability on WebKit platforms
- Extract shared runAnalysis helper and consolidate types in analyzer/types.ts
- Remove transient data from Redux slice, add reset() to collectors
- Move FPS degradation detection into DegradationDetector for consistency with LeakDetector
- Move CpuAnalyzer into cpu-analyzer for consistency with DegradationDetector and LeakDetector
…ture

- Add comprehensive tests for SampleBuffer, analyzers (heap, fps, cpu, gpu), and slice reducers (89 new tests)
- Extract ResourceAnalyzer and PollingCollector base classes to reduce code duplication
- Add versioned types infrastructure (types/v0.ts) for future schema migrations
- Add workflow registry pattern for extensible profiling workflows
- Rename framerate.ts → fps.ts for consistency
- Remove unnecessary range creation retry logic
@sy-nico sy-nico requested a review from pjdotson December 24, 2025 03:04

@greptile-apps greptile-apps bot left a comment

96 files reviewed, 2 comments

@sy-nico
Contributor Author

sy-nico commented Dec 24, 2025

@greptile, comments have been addressed and the 0-byte driver file has been restored to its rc state. Update the score and table previously left in the PR description.

@emilbon99 emilbon99 requested review from emilbon99 and removed request for pjdotson December 29, 2025 16:05