⚡ Fix benchmark_entities fixture scope for faster benchmark runs #171

Description

@sodre

Summary

The benchmark_entities fixture in tests/benchmark/conftest.py creates 100 entities for every test that uses it. Changing it to scope="module" is expected to make benchmark runs 30-50% faster.

Problem

Current Implementation

# tests/benchmark/conftest.py:182-207
@pytest.fixture
def benchmark_entities(sync_limiter: Any) -> list[str]:
    """Pre-create entities with pre-warmed buckets for throughput tests."""
    entity_ids = [f"bench-entity-{i:03d}" for i in range(100)]
    limits = [Limit.per_minute("rpm", 1_000_000)]

    for entity_id in entity_ids:
        # Create entity
        sync_limiter.create_entity(entity_id, name=f"Benchmark Entity {entity_id}")
        # Pre-warm bucket by doing one acquire
        with sync_limiter.acquire(...):
            pass

    return entity_ids

Issues

  1. Function-scoped (default): Creates 100 entities + 100 acquires per test
  2. Repeated setup overhead: If 5 tests use this fixture, setup runs 5 times
  3. Slow benchmarks: Each setup adds ~1-2 seconds (moto) or more (LocalStack)

Impact Analysis

Looking at benchmark tests that use benchmark_entities:

  • test_throughput.py - Multiple tests use pre-warmed entities
  • test_capacity.py - Capacity counting tests

With 5+ tests using this fixture, we're wasting 5-10 seconds of repeated setup.
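The 5-10 second figure follows from the ~1-2 seconds per setup quoted above for moto, multiplied by five tests. Below is a minimal sketch of that arithmetic; the function names are illustrative and not part of the test suite. One setup is still needed under any scope, so the avoidable part is slightly smaller than the total.

```python
def total_setup_seconds(num_tests: int, setup_seconds: float) -> float:
    """Total time spent in fixture setup when it runs once per test."""
    return num_tests * setup_seconds


def avoidable_setup_seconds(num_tests: int, setup_seconds: float) -> float:
    """Setup time that a shared fixture could save (one setup is still required)."""
    return max(num_tests - 1, 0) * setup_seconds


# Five tests, using the ~1-2 s per-setup estimate from the issue (moto backend).
for per_setup in (1.0, 2.0):
    print(
        f"per-setup {per_setup:.0f}s: "
        f"total {total_setup_seconds(5, per_setup):.0f}s, "
        f"avoidable {avoidable_setup_seconds(5, per_setup):.0f}s"
    )
```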

Proposed Solution

Change fixture scope to module (shared across tests in same file):

@pytest.fixture(scope="module")
def benchmark_entities(sync_limiter: Any) -> list[str]:
    """Pre-create entities with pre-warmed buckets for throughput tests.
    
    Module-scoped to avoid repeated setup across benchmark tests.
    Entities are created once per test file.
    """
    ...

Why module and not session?

  • session scope would share across all benchmark files
  • Different files may need different entity configurations
  • module is the right granularity for benchmark isolation

Dependency Chain

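The fixture depends on sync_limiter, which is function-scoped. pytest raises a ScopeMismatch error when a module-scoped fixture requests a function-scoped one, so we need to either:

  1. Create a module-scoped limiter for benchmarks
  2. Or use a separate setup that doesn't depend on the function-scoped limiter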

Option A: Module-scoped benchmark limiter

@pytest.fixture(scope="module")
def benchmark_limiter(mock_dynamodb_module):
    """Module-scoped limiter for benchmark entity setup."""
    with _patch_aiobotocore_response():
        limiter = SyncRateLimiter(name="benchmark", region="us-east-1")
        limiter._run(limiter._limiter._repository.create_table())
        with limiter:
            yield limiter

@pytest.fixture(scope="module")
def benchmark_entities(benchmark_limiter):
    """Pre-create entities once per module."""
    ...

Option B: Lazy initialization with caching

_cached_entities: list[str] | None = None

@pytest.fixture
def benchmark_entities(sync_limiter):
    global _cached_entities
    if _cached_entities is None:
        _cached_entities = _create_entities(sync_limiter)
    return _cached_entities

Recommendation: Option A is cleaner and more pytest-idiomatic.
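One reason Option B is less clean can be shown without pytest. The sketch below uses a stand-in FakeLimiter (hypothetical, it only records create_entity calls) and a plain function in place of the fixture. It assumes, as an illustration only, that each test receives a fresh limiter with its own backend; whether sync_limiter behaves that way in this suite is not confirmed here.

```python
from typing import Optional


class FakeLimiter:
    """Hypothetical stand-in for the real limiter; records created entity IDs."""

    def __init__(self) -> None:
        self.created: list[str] = []

    def create_entity(self, entity_id: str, name: str) -> None:
        self.created.append(entity_id)


_cached_entities: Optional[list[str]] = None


def get_benchmark_entities(limiter: FakeLimiter, count: int = 100) -> list[str]:
    """Create entities on first call only; later calls return the cached IDs."""
    global _cached_entities
    if _cached_entities is None:
        ids = [f"bench-entity-{i:03d}" for i in range(count)]
        for entity_id in ids:
            limiter.create_entity(entity_id, name=f"Benchmark Entity {entity_id}")
        _cached_entities = ids
    return _cached_entities


# Simulate two tests, each receiving its own limiter instance.
first, second = FakeLimiter(), FakeLimiter()
ids_for_first = get_benchmark_entities(first)
ids_for_second = get_benchmark_entities(second)

print(len(first.created), len(second.created))
```

Under that assumption the second caller gets IDs that were never created against its own limiter, so the cache is only safe when every test shares one backend. Option A makes that sharing explicit through fixture scope.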

Tasks

  • Create mock_dynamodb_module fixture with scope="module"
  • Create benchmark_limiter fixture with scope="module"
  • Change benchmark_entities to scope="module" depending on benchmark_limiter
  • Create BenchmarkEntities dataclass with flat + hierarchy entities
  • Switch test_throughput.py to use benchmark_entities
  • Switch test_latency.py to use benchmark_entities
  • Switch test_operations.py to use benchmark_entities (where appropriate)
  • Verify all benchmark tests still pass
  • Measure before/after benchmark run times
  • Investigate regression: module-scoped fixtures are slower, not faster (see Performance Measurement)
  • Update tests/benchmark/conftest.py docstring to explain scoping strategy
  • Add guidance to .claude/rules/testing.md about fixture scope selection

Acceptance Criteria

  • Benchmark suite runs 30-50% faster (currently regressed, see below)
  • All benchmark tests pass
  • Entity setup runs once per file, not once per test
  • Documentation updated with scope guidance

Documentation Updates

.claude/rules/testing.md Addition (done)

### Fixture Scope Selection

| Scope | Use When | Example |
|-------|----------|---------|
| `function` | Test mutates state, needs isolation | `limiter` (each test gets clean state) |
| `class` | Expensive setup shared by class | `e2e_limiter` (CloudFormation stack) |
| `module` | Expensive setup shared by file | `benchmark_entities` (100 pre-warmed entities) |
| `session` | Immutable configuration | `localstack_endpoint` (env var read) |

**Rule**: If fixture setup takes >100ms and is used by multiple tests, consider broader scope.

Performance Measurement

Current results (branch perf/171-benchmark-fixture-scope)

The module-scoped approach is slower than the original function-scoped approach:

| File | Main (baseline) | Feature branch | Change |
|------|-----------------|----------------|--------|
| test_throughput.py (--benchmark-skip) | 5.25-6.36s | 9.94-10.64s | +60% slower |
| test_latency.py (--benchmark-only) | 8.76s | 14.32s | +63% slower |
| test_operations.py (--benchmark-only) | 19.10s | 26.05s | +36% slower |
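The Change column can be rechecked from the reported timings. The helper below is a small sketch (percent_change is an illustrative name) that computes the relative slowdown. For the throughput row, which is reported as ranges, the result depends on which ends of the ranges are paired, so both like-for-like pairings are printed rather than a single figure.

```python
def percent_change(baseline_s: float, feature_s: float) -> float:
    """Relative change from baseline to feature branch, in percent."""
    return (feature_s - baseline_s) / baseline_s * 100.0


# Single-valued rows from the measurement table.
latency = percent_change(8.76, 14.32)
operations = percent_change(19.10, 26.05)

# test_throughput.py is reported as ranges: pair fastest with fastest
# and slowest with slowest.
throughput_fastest = percent_change(5.25, 9.94)
throughput_slowest = percent_change(6.36, 10.64)

print(f"test_latency.py:    {latency:+.1f}%")
print(f"test_operations.py: {operations:+.1f}%")
print(f"test_throughput.py: {throughput_slowest:+.1f}% to {throughput_fastest:+.1f}%")
```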

Root cause analysis

The regression comes from the module-scoped benchmark_entities fixture pre-creating 111 entities (100 flat + 1 parent + 10 cascade children) with warmup acquires the first time a test in the module requests it. This upfront cost is paid only once per test file, but it still dominates the total time because:

  1. test_throughput.py: 7 tests that previously used sync_limiter and created entities lazily now pay the 111-entity setup cost upfront. The old approach only created entities as needed within each test.
  2. test_latency.py: 10 benchmark tests now depend on benchmark_entities, adding the 111-entity setup that didn't exist before (tests previously used sync_limiter with per-test entity creation).
  3. test_operations.py: Mixed -- some tests correctly use module-scoped benchmark_entities for steady-state benchmarks, while others use function-scoped sync_limiter for optimization comparisons. But the module-scoped setup still runs once for the file.

Next steps

The optimization hypothesis was wrong for the moto backend -- moto entity creation is cheap enough (~10ms per entity) that pre-creating 111 entities is slower than creating a few entities per test. The module-scoped approach may only pay off with:

  • LocalStack/real DynamoDB where entity creation is expensive
  • Files where many tests share the same entities (more than 20 tests)
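The trade-off described here can be put into a rough cost model. The sketch below is not fitted to the measurements above. It takes the ~10ms per-entity figure and the 111-entity count from this issue, and assumes (hypothetically) that a function-scoped test creates 3 entities and that the backend has some fixed per-setup overhead such as table creation. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetupCost:
    per_entity_ms: float  # create + warm-up acquire for one entity
    fixed_ms: float       # hypothetical per-setup overhead (e.g. table creation)


def function_scoped_ms(cost: SetupCost, tests: int, entities_per_test: int) -> float:
    """Every test pays the fixed overhead plus its own few entities."""
    return tests * (cost.fixed_ms + entities_per_test * cost.per_entity_ms)


def module_scoped_ms(cost: SetupCost, pre_created: int) -> float:
    """The file pays the fixed overhead once plus all pre-created entities."""
    return cost.fixed_ms + pre_created * cost.per_entity_ms


# ~10 ms per entity comes from the issue; zero fixed overhead is an assumption.
moto = SetupCost(per_entity_ms=10.0, fixed_ms=0.0)

# 7 tests (as in test_throughput.py), 3 entities per test (assumed).
lazy = function_scoped_ms(moto, tests=7, entities_per_test=3)
eager_111 = module_scoped_ms(moto, pre_created=111)
eager_14 = module_scoped_ms(moto, pre_created=14)

print(f"function-scoped: {lazy:.0f} ms")
print(f"module-scoped, 111 entities: {eager_111:.0f} ms")
print(f"module-scoped, 14 entities:  {eager_14:.0f} ms")
```

Under these assumptions, pre-creating 111 entities costs more than lazy per-test creation, while a 14-entity set costs less. A backend with a large fixed_ms would shift the balance toward module scope, since function scope pays that overhead once per test.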

Options to fix:

  1. Revert to function-scoped fixtures for moto-based benchmarks and only use module-scoped for LocalStack benchmarks
  2. Reduce entity count in benchmark_entities (e.g., 10 flat + 1 parent + 3 children = 14 instead of 111)
  3. Keep BenchmarkEntities dataclass but use it function-scoped with the regular sync_limiter
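Options 2 and 3 both keep the BenchmarkEntities dataclass and only change how many entities it describes or when they are created. Below is a minimal sketch of what such a container could look like. The field names, the plan_entities helper, and the parent/child ID formats are assumptions for illustration; only the flat ID format and the entity counts come from this issue. The sketch builds IDs only and does not touch a limiter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkEntities:
    """Entity IDs grouped by role (field names are illustrative)."""

    flat: tuple[str, ...]
    parent: str
    children: tuple[str, ...]

    @property
    def all_ids(self) -> tuple[str, ...]:
        """Every entity ID that setup would need to create."""
        return (*self.flat, self.parent, *self.children)


def plan_entities(num_flat: int = 100, num_children: int = 10) -> BenchmarkEntities:
    """Build the ID plan; defaults mirror the current 100 + 1 + 10 layout."""
    return BenchmarkEntities(
        flat=tuple(f"bench-entity-{i:03d}" for i in range(num_flat)),
        parent="bench-parent",  # hypothetical ID
        children=tuple(f"bench-child-{i:02d}" for i in range(num_children)),  # hypothetical IDs
    )


current = plan_entities()
reduced = plan_entities(num_flat=10, num_children=3)
print(len(current.all_ids), len(reduced.all_ids))
```

Keeping the counts as parameters would let the same dataclass serve a small function-scoped setup on moto and a larger module-scoped setup on LocalStack.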

Labels: performance, testing