Add performance tests for the rules using BenchmarkDotNet #214

@joeldickson

Description

Problem Statement

Currently, we lack visibility into the performance characteristics of our Roslyn analyzers. Without performance benchmarks, we risk introducing performance regressions that could significantly impact developer experience in IDEs and CI/CD pipelines. We need a comprehensive performance testing harness to:

  • Measure analyzer performance under different conditions
  • Detect performance regressions in CI
  • Identify optimization opportunities
  • Ensure analyzers scale well with large codebases

Proposed Solution

Overview

Implement a BenchmarkDotNet-based performance testing suite that measures analyzer performance across various scenarios and code complexity levels.

Detailed Plan

Phase 1: Infrastructure Setup

1.1 Create Performance Test Project Structure

Agoda.Analyzers.PerformanceTests/
├── Agoda.Analyzers.PerformanceTests.csproj
├── Benchmarks/
│   ├── IndividualRuleBenchmarks.cs
│   ├── FullAnalyzerSuiteBenchmarks.cs
│   └── ScalabilityBenchmarks.cs
├── TestData/
│   ├── NoViolations/
│   │   ├── LargeCodebase/
│   │   ├── MediumCodebase/
│   │   └── SmallCodebase/
│   ├── WithViolations/
│   │   ├── AllRulesViolated/
│   │   ├── HighViolationDensity/
│   │   └── LowViolationDensity/
│   └── EdgeCases/
│       ├── DeepInheritanceHierarchy/
│       ├── LargeFiles/
│       └── ComplexExpressions/
├── Utilities/
│   ├── CodeGenerator.cs
│   ├── TestDataManager.cs
│   └── ReportGenerator.cs
└── Scripts/
    ├── generate-test-data.ps1
    └── run-benchmarks.ps1

1.2 Dependencies and Configuration

  • Add BenchmarkDotNet package
  • Add Microsoft.CodeAnalysis.Testing packages
  • Configure benchmark attributes and parameters
  • Set up memory and CPU profiling (a configuration sketch follows this list)
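
One way to centralize this is a shared BenchmarkDotNet configuration class; the class name and exporter choices below are illustrative, not decided:

using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Environments;
using BenchmarkDotNet.Exporters;
using BenchmarkDotNet.Exporters.Json;
using BenchmarkDotNet.Jobs;

// Hypothetical shared config, applied to benchmark classes via [Config(typeof(AnalyzerBenchmarkConfig))].
public class AnalyzerBenchmarkConfig : ManualConfig
{
    public AnalyzerBenchmarkConfig()
    {
        AddJob(Job.Default.WithRuntime(CoreRuntime.Core60));
        AddJob(Job.Default.WithRuntime(CoreRuntime.Core80));
        AddDiagnoser(MemoryDiagnoser.Default);   // peak allocations and GC pressure
        AddExporter(JsonExporter.Full);          // machine-readable output for CI
        AddExporter(HtmlExporter.Default);       // human-readable report
        AddExporter(MarkdownExporter.GitHub);    // summary suitable for PR comments
    }
}

The same exporters would also cover the HTML/JSON/Markdown outputs listed in Phase 4.3.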

Phase 2: Test Data Generation

2.1 No Violations Test Projects
Create synthetic codebases that don't trigger any analyzer rules:

  • Small: ~50 files, 5K LOC - Basic classes, methods, properties
  • Medium: ~200 files, 25K LOC - Realistic project structure with dependencies
  • Large: ~1000 files, 100K LOC - Enterprise-scale codebase simulation

2.2 With Violations Test Projects
Create codebases that intentionally trigger rules:

  • All Rules Violated: Each rule triggered at least once per file
  • High Density: Multiple violations per file (stress test)
  • Low Density: Sparse violations (realistic scenario)

2.3 Edge Case Scenarios

  • Files with 10K+ lines of code
  • Deep inheritance hierarchies (20+ levels)
  • Complex LINQ expressions and nested lambdas
  • Large numbers of using directives and deeply nested namespaces
  • Auto-generated code patterns

2.4 Code Generation Utility

public class CodeGenerator
{
    // Emits a synthetic codebase that triggers no analyzer rules, sized by file count and average file length.
    public void GenerateNoViolationCodebase(int fileCount, int averageLinesPerFile) { /* ... */ }

    // Emits a codebase seeded with violations of the given rules at the requested density.
    public void GenerateViolationCodebase(RuleSet rules, ViolationDensity density) { /* ... */ }

    // Emits the edge-case scenarios above (very large files, deep hierarchies, complex expressions).
    public void GenerateEdgeCaseScenarios() { /* ... */ }
}
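
A possible driver, invoked from generate-test-data.ps1, to materialize the TestData folders; the file counts mirror the size targets above, while RuleSet.All and the ViolationDensity values are assumptions to be pinned down during implementation:

// Hypothetical entry point for test data generation.
var generator = new CodeGenerator();

generator.GenerateNoViolationCodebase(fileCount: 50,   averageLinesPerFile: 100); // Small  (~5K LOC)
generator.GenerateNoViolationCodebase(fileCount: 200,  averageLinesPerFile: 125); // Medium (~25K LOC)
generator.GenerateNoViolationCodebase(fileCount: 1000, averageLinesPerFile: 100); // Large  (~100K LOC)

generator.GenerateViolationCodebase(RuleSet.All, ViolationDensity.High); // stress test
generator.GenerateViolationCodebase(RuleSet.All, ViolationDensity.Low);  // realistic scenario

generator.GenerateEdgeCaseScenarios(); // 10K+ line files, deep hierarchies, complex LINQ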

Phase 3: Benchmark Implementation

3.1 Individual Rule Benchmarks

using System.Collections.Generic;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;

[MemoryDiagnoser]
[SimpleJob(RuntimeMoniker.Net60)]
[SimpleJob(RuntimeMoniker.Net80)]
public class IndividualRuleBenchmarks
{
    [Params("Small", "Medium", "Large")]
    public string CodebaseSize { get; set; }

    [Params("NoViolations", "WithViolations")]
    public string ViolationType { get; set; }

    // Supplies each rule's diagnostic ID to the benchmark; to be populated by enumerating
    // DiagnosticAnalyzer types in the analyzer assembly.
    public static IEnumerable<string> GetAllRules()
    {
        yield break;
    }

    [Benchmark]
    [ArgumentsSource(nameof(GetAllRules))]
    public async Task BenchmarkRule(string ruleName)
    {
        // Benchmark individual rule performance against the selected codebase.
    }
}
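
Inside BenchmarkRule, the measured operation would most likely be a single-analyzer Roslyn run; a minimal sketch, assuming the compilation and analyzer instance are resolved in a [GlobalSetup] step so that file I/O and parsing stay outside the timed region:

using System.Collections.Immutable;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;

// Runs one analyzer over a pre-built compilation and returns only analyzer-produced diagnostics.
public static async Task<ImmutableArray<Diagnostic>> RunSingleRuleAsync(
    Compilation compilation, DiagnosticAnalyzer analyzer)
{
    var withAnalyzers = compilation.WithAnalyzers(ImmutableArray.Create(analyzer));
    return await withAnalyzers.GetAnalyzerDiagnosticsAsync();
}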

3.2 Full Analyzer Suite Benchmarks

  • Measure complete analyzer suite performance
  • Test with/without violations
  • Different codebase sizes
  • Memory usage tracking

3.3 Scalability Benchmarks

  • Linear scaling tests (1x, 2x, 4x, 8x code size)
  • File count vs file size impact analysis
  • Concurrent analysis performance (a scaling benchmark is sketched below)
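
The scale factor can itself be a benchmark parameter; a sketch of the scalability class (the TestDataManager helpers are hypothetical):

using System.Collections.Immutable;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;

[MemoryDiagnoser]
public class ScalabilityBenchmarks
{
    // Multiplier over the baseline codebase size.
    [Params(1, 2, 4, 8)]
    public int ScaleFactor { get; set; }

    private Compilation _compilation;

    [GlobalSetup]
    public void Setup()
    {
        // Hypothetical helper that loads the pre-generated codebase for this scale factor.
        _compilation = TestDataManager.LoadCompilation(ScaleFactor);
    }

    [Benchmark]
    public async Task<ImmutableArray<Diagnostic>> FullSuiteAtScale()
    {
        var analyzers = TestDataManager.GetAllAnalyzers(); // hypothetical helper
        return await _compilation.WithAnalyzers(analyzers).GetAnalyzerDiagnosticsAsync();
    }
}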

Phase 4: Performance Metrics and Reporting

4.1 Key Metrics to Capture

  • Execution Time: Mean, median, P95, P99
  • Memory Usage: Peak memory, GC pressure
  • Throughput: Files analyzed per second
  • Scaling Characteristics: Performance vs codebase size

4.2 Baseline and Regression Detection

  • Store baseline performance metrics
  • Automated comparison with previous runs
  • Configurable performance regression thresholds
  • Performance trend analysis (a minimal regression check is sketched below)
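
One possible shape for the gate, comparing current mean times against a stored baseline; how the dictionaries are populated (for example, parsed from BenchmarkDotNet's JSON export) is an implementation detail, and the thresholds match the 10%/20% values in Phase 6.1:

using System;
using System.Collections.Generic;

// Sketch of a regression check over per-benchmark mean execution times (nanoseconds).
public static class RegressionCheck
{
    public static bool HasRegression(
        IReadOnlyDictionary<string, double> baselineMeanNs,
        IReadOnlyDictionary<string, double> currentMeanNs,
        double failThreshold = 0.20,  // fail CI above +20 %
        double warnThreshold = 0.10)  // warn above +10 %
    {
        var failed = false;
        foreach (var (benchmark, baseline) in baselineMeanNs)
        {
            if (!currentMeanNs.TryGetValue(benchmark, out var current)) continue;

            var delta = (current - baseline) / baseline;
            if (delta > failThreshold)
            {
                Console.Error.WriteLine($"FAIL {benchmark}: +{delta:P1}");
                failed = true;
            }
            else if (delta > warnThreshold)
            {
                Console.WriteLine($"WARN {benchmark}: +{delta:P1}");
            }
        }
        return failed;
    }
}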

4.3 Reporting Format

  • HTML reports with charts and graphs
  • JSON output for CI integration
  • Markdown summaries for PR comments
  • Historical performance tracking

Phase 5: GitHub Actions Integration

5.1 Performance Test Workflow

name: Performance Tests

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]
  schedule:
    # Run nightly performance tests
    - cron: '0 2 * * *'

jobs:
  performance-tests:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Setup .NET
      uses: actions/setup-dotnet@v4
      with:
        dotnet-version: |
          6.0.x
          8.0.x
    
    - name: Restore dependencies
      run: dotnet restore
    
    - name: Generate test data
      shell: pwsh
      run: ./Scripts/generate-test-data.ps1
    
    - name: Run benchmarks
      run: dotnet run -c Release --project Agoda.Analyzers.PerformanceTests
    
    - name: Upload benchmark results
      uses: actions/upload-artifact@v4
      with:
        name: benchmark-results
        path: BenchmarkDotNet.Artifacts/
    
    - name: Performance regression check
      shell: pwsh
      run: ./Scripts/check-performance-regression.ps1
    
    - name: Comment PR with results
      if: github.event_name == 'pull_request'
      uses: actions/github-script@v7
      with:
        script: |
          // Post performance summary to PR

5.2 Performance Monitoring Workflow

name: Performance Monitoring

on:
  schedule:
    - cron: '0 6 * * 1' # Weekly detailed analysis

jobs:
  detailed-performance-analysis:
    runs-on: ubuntu-latest
    
    steps:
    - name: Extended benchmark suite
      run: |
        # Run comprehensive benchmarks

    - name: Generate performance report
      run: |
        # Create detailed HTML report

    - name: Update performance dashboard
      run: |
        # Update GitHub Pages dashboard

Phase 6: CI/CD Integration Details

6.1 Performance Gates

  • Fail CI if performance degrades by >20%
  • Warning if performance degrades by >10%
  • Different thresholds for different rule categories

6.2 Conditional Execution

- name: Check if performance tests needed
  id: check-changes
  run: |
    # Only run expensive tests if analyzer code changed
    if git diff --name-only HEAD~1 | grep -E "(Analyzers/|Rules/)"; then
      echo "run-perf-tests=true" >> $GITHUB_OUTPUT
    fi

- name: Run performance tests
  if: steps.check-changes.outputs.run-perf-tests == 'true'
  run: dotnet run -c Release --project Agoda.Analyzers.PerformanceTests

6.3 Results Storage and Tracking

  • Store results in GitHub artifacts
  • Optional: Store in external system (Azure Storage, etc.)
  • Performance history tracking
  • Automated performance regression alerts

Implementation Checklist

Infrastructure

  • Create performance test project
  • Add BenchmarkDotNet dependencies
  • Set up project structure
  • Configure benchmark parameters

Test Data

  • Implement code generation utilities
  • Generate no-violation test codebases
  • Generate with-violation test codebases
  • Create edge case scenarios
  • Validate test data quality

Benchmarks

  • Implement individual rule benchmarks
  • Implement full suite benchmarks
  • Implement scalability benchmarks
  • Add memory profiling
  • Configure multiple runtime targets

Reporting

  • Implement baseline storage
  • Create performance regression detection
  • Generate HTML reports
  • Create markdown summaries

CI/CD

  • Create performance test workflow
  • Implement conditional execution
  • Add performance gates
  • Set up artifact storage
  • Configure PR commenting

Documentation

  • Add README for performance tests
  • Document benchmark methodology
  • Create troubleshooting guide
  • Add performance optimization guidelines

Performance Optimization Focus Areas

Based on common Roslyn analyzer performance issues, pay special attention to the following (a node-registration sketch follows the list):

  1. Node Filtering: Ensure analyzers only register for relevant syntax nodes
  2. Symbol Resolution: Minimize expensive symbol lookups
  3. Tree Traversal: Avoid unnecessary full tree walks
  4. Memory Allocations: Minimize allocations in hot paths
  5. Diagnostic Creation: Efficient diagnostic reporting
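
For item 1 in particular, a representative registration pattern looks like the following (illustrative analyzer skeleton, rule logic and descriptors omitted):

using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public class ExampleScopedAnalyzer : DiagnosticAnalyzer
{
    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics =>
        ImmutableArray<DiagnosticDescriptor>.Empty; // descriptors omitted in this sketch

    public override void Initialize(AnalysisContext context)
    {
        context.EnableConcurrentExecution();
        // Skip generated code so auto-generated files don't dominate analysis time.
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);

        // Register only for the node kinds the rule actually inspects,
        // rather than a syntax-tree action that walks every node.
        context.RegisterSyntaxNodeAction(AnalyzeInvocation, SyntaxKind.InvocationExpression);
    }

    private static void AnalyzeInvocation(SyntaxNodeAnalysisContext context)
    {
        // Rule-specific checks; keep symbol lookups and allocations minimal here (hot path).
    }
}

The benchmarks above should make it obvious when a rule deviates from this pattern and falls back to whole-tree traversal.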

Success Criteria

  • All existing rules have performance baselines established
  • Performance regression detection is working in CI
  • Performance improves or remains stable for new rule additions
  • Documentation and tooling enable easy performance analysis
  • Team can identify and fix performance issues quickly

Timeline

  • Week 1-2: Infrastructure setup and project structure
  • Week 3-4: Test data generation and validation
  • Week 5-6: Benchmark implementation
  • Week 7: GitHub Actions integration
  • Week 8: Documentation and refinement

Future Enhancements

  • Integration with continuous performance monitoring
  • Performance comparison across .NET versions
  • Automated performance optimization suggestions
  • Integration with IDE performance profiling tools
