⚡ Fix benchmark_entities fixture scope for faster benchmark runs

## Summary

The `benchmark_entities` fixture in `tests/benchmark/conftest.py` creates 100 entities for **every test** that uses it. Changing it to `scope="module"` was expected to make benchmark runs 30-50% faster, but measurements on the feature branch show a regression (see Performance Measurement).

## Problem

### Current Implementation

```python
# tests/benchmark/conftest.py:182-207
@pytest.fixture
def benchmark_entities(sync_limiter: Any) -> list[str]:
    """Pre-create entities with pre-warmed buckets for throughput tests."""
    entity_ids = [f"bench-entity-{i:03d}" for i in range(100)]
    limits = [Limit.per_minute("rpm", 1_000_000)]

    for entity_id in entity_ids:
        # Create entity
        sync_limiter.create_entity(entity_id, name=f"Benchmark Entity {entity_id}")
        # Pre-warm bucket by doing one acquire
        with sync_limiter.acquire(...):
            pass

    return entity_ids
```

### Issues

1. **Function-scoped (default)**: Creates 100 entities + 100 acquires **per test**
2. **Repeated setup overhead**: If 5 tests use this fixture, setup runs 5 times
3. **Slow benchmarks**: Each setup adds ~1-2 seconds (moto) or more (LocalStack)

### Impact Analysis

Looking at benchmark tests that use `benchmark_entities`:
- `test_throughput.py` - Multiple tests use pre-warmed entities
- `test_capacity.py` - Capacity counting tests

With 5+ tests using this fixture, we're wasting 5-10 seconds of repeated setup.
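
The 5-10 second figure is the per-setup cost multiplied by the number of tests that request the fixture. A minimal sketch of that arithmetic, using the estimates above (the function name is illustrative and not part of the test suite):

```python
def repeated_setup_seconds(num_tests: int, setup_seconds: float) -> float:
    """Total time spent in a function-scoped fixture's setup across a run."""
    # A function-scoped fixture runs its setup once for every test that requests it.
    return num_tests * setup_seconds


# Estimates from this issue: 5 tests, roughly 1-2 s per setup on moto.
low = repeated_setup_seconds(5, 1.0)
high = repeated_setup_seconds(5, 2.0)
print(f"Setup time across the run: {low:.0f}-{high:.0f} s")
```

With a broader scope, only the first of those setups would be needed, so the rest is the overhead this issue targets.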

## Proposed Solution

Change the fixture scope to `module` (shared across tests in the same file):

```python
@pytest.fixture(scope="module")
def benchmark_entities(sync_limiter: Any) -> list[str]:
    """Pre-create entities with pre-warmed buckets for throughput tests.
    
    Module-scoped to avoid repeated setup across benchmark tests.
    Entities are created once per test file.
    """
    ...
```

### Why `module` and not `session`?

- `session` scope would share across all benchmark files
- Different files may need different entity configurations
- `module` is the right granularity for benchmark isolation

### Dependency Chain

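The fixture depends on `sync_limiter`, which is function-scoped. pytest does not allow a module-scoped fixture to request a function-scoped one (it fails with a `ScopeMismatch` error), so we need to either:

1. Create a `module`-scoped limiter for benchmarks, or
2. Use a separate setup that doesn't depend on the function-scoped limiter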

**Option A**: Module-scoped benchmark limiter
```python
@pytest.fixture(scope="module")
def benchmark_limiter(mock_dynamodb_module):
    """Module-scoped limiter for benchmark entity setup."""
    with _patch_aiobotocore_response():
        limiter = SyncRateLimiter(name="benchmark", region="us-east-1")
        limiter._run(limiter._limiter._repository.create_table())
        with limiter:
            yield limiter

@pytest.fixture(scope="module")
def benchmark_entities(benchmark_limiter):
    """Pre-create entities once per module."""
    ...
```

**Option B**: Lazy initialization with caching
```python
_cached_entities: list[str] | None = None

@pytest.fixture
def benchmark_entities(sync_limiter):
    global _cached_entities
    if _cached_entities is None:
        _cached_entities = _create_entities(sync_limiter)
    return _cached_entities
```

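**Recommendation**: Option A is cleaner and more pytest-idiomatic. Option B relies on module-level global state, and the cached IDs stay valid only if every function-scoped `sync_limiter` points at the same backing table.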

## Tasks

- [x] Create `mock_dynamodb_module` fixture with `scope="module"`
- [x] Create `benchmark_limiter` fixture with `scope="module"`
- [x] Change `benchmark_entities` to `scope="module"` depending on `benchmark_limiter`
- [x] Create `BenchmarkEntities` dataclass with flat + hierarchy entities
- [x] Switch `test_throughput.py` to use `benchmark_entities`
- [x] Switch `test_latency.py` to use `benchmark_entities`
- [x] Switch `test_operations.py` to use `benchmark_entities` (where appropriate)
- [ ] Verify all benchmark tests still pass
- [ ] Measure before/after benchmark run times
- [ ] Investigate regression: module-scoped fixtures are **slower**, not faster (see Performance Measurement)
- [ ] Update `tests/benchmark/conftest.py` docstring to explain scoping strategy
- [x] Add guidance to `.claude/rules/testing.md` about fixture scope selection

## Acceptance Criteria

- [ ] Benchmark suite runs 30-50% faster (currently regressed, see below)
- [ ] All benchmark tests pass
- [ ] Entity setup runs once per file, not once per test
- [x] Documentation updated with scope guidance

## Documentation Updates

### `.claude/rules/testing.md` Addition (done)

```markdown
### Fixture Scope Selection

| Scope | Use When | Example |
|-------|----------|---------|
| `function` | Test mutates state, needs isolation | `limiter` (each test gets clean state) |
| `class` | Expensive setup shared by class | `e2e_limiter` (CloudFormation stack) |
| `module` | Expensive setup shared by file | `benchmark_entities` (100 pre-warmed entities) |
| `session` | Immutable configuration | `localstack_endpoint` (env var read) |

**Rule**: If fixture setup takes >100ms and is used by multiple tests, consider broader scope.
```

## Performance Measurement

### Current results (branch `perf/171-benchmark-fixture-scope`)

The module-scoped approach is **slower** than the original function-scoped approach:

| File | Main (baseline) | Feature branch | Change |
|------|-----------------|----------------|--------|
| `test_throughput.py` (`--benchmark-skip`) | 5.25-6.36s | 9.94-10.64s | **+67-89% slower** |
| `test_latency.py` (`--benchmark-only`) | 8.76s | 14.32s | **+63% slower** |
| `test_operations.py` (`--benchmark-only`) | 19.10s | 26.05s | **+36% slower** |
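
For reference, the "Change" column is the relative increase over the baseline. A small sketch of that calculation for the two rows with single-value timings (the helper name is illustrative):

```python
def percent_change(baseline_s: float, feature_s: float) -> float:
    """Relative change of the feature-branch time versus the baseline, in percent."""
    return (feature_s - baseline_s) / baseline_s * 100.0


# Wall-clock timings (seconds) copied from the table above.
timings = {
    "test_latency.py": (8.76, 14.32),
    "test_operations.py": (19.10, 26.05),
}

for name, (baseline_s, feature_s) in timings.items():
    print(f"{name}: {percent_change(baseline_s, feature_s):+.0f}%")
```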

### Root cause analysis

The regression comes from the module-scoped `benchmark_entities` fixture pre-creating **111 entities** (100 flat + 1 parent + 10 cascade children) with warmup acquires the first time a test in the file requests it. This upfront cost is paid only once per test file, but it dominates the total time because:

1. **`test_throughput.py`**: 7 tests that previously used `sync_limiter` and created entities lazily now pay the 111-entity setup cost upfront. The old approach only created entities as needed within each test.
2. **`test_latency.py`**: 10 benchmark tests now depend on `benchmark_entities`, adding the 111-entity setup that didn't exist before (tests previously used `sync_limiter` with per-test entity creation).
3. **`test_operations.py`**: Mixed -- some tests correctly use module-scoped `benchmark_entities` for steady-state benchmarks, while others use function-scoped `sync_limiter` for optimization comparisons. But the module-scoped setup still runs once for the file.

### Next steps

The optimization hypothesis was wrong for the moto backend -- moto entity creation is cheap enough (~10ms per entity) that pre-creating 111 entities is slower than creating a few entities per test. The module-scoped approach may only pay off with:
- LocalStack/real DynamoDB where entity creation is expensive
- Files with many tests sharing the same entities (more than 20 tests)
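
The trade-off above can be sketched as a break-even count: one shared setup only wins once the tests in a file would, between them, have created at least as many entities on their own. The per-entity cost cancels out, so only the counts matter. The figure of 5 entities per test is an assumption for illustration, not a measured value from this suite:

```python
import math


def break_even_tests(shared_entities: int, entities_per_test: int) -> int:
    """Smallest number of tests at which one shared setup costs no more
    than creating `entities_per_test` entities in every test."""
    return math.ceil(shared_entities / entities_per_test)


# ASSUMPTION: each test would otherwise create about 5 entities of its own.
ENTITIES_PER_TEST = 5

print(break_even_tests(111, ENTITIES_PER_TEST))  # current 111-entity fixture
print(break_even_tests(14, ENTITIES_PER_TEST))   # reduced 14-entity variant
```

This ignores the warmup acquires and any per-test limiter setup, so it is only a rough guide to when a module-scoped fixture is worth it.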

Options to fix:
1. **Revert to function-scoped fixtures** for moto-based benchmarks and only use module-scoped for LocalStack benchmarks
2. **Reduce entity count** in `benchmark_entities` (e.g., 10 flat + 1 parent + 3 children = 14 instead of 111)
3. **Keep `BenchmarkEntities` dataclass** but use it function-scoped with the regular `sync_limiter`
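
Options 2 and 3 could share one container with configurable sizes. The sketch below is hypothetical: the field names, ID formats for the parent and children, and the `plan_entities` helper are assumptions, and the real `BenchmarkEntities` dataclass in `conftest.py` may look different. It only generates IDs and does not create anything in the limiter:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkEntities:
    """Hypothetical layout: flat entities plus one parent with cascade children."""

    flat: list[str]
    parent: str
    children: list[str]

    @property
    def total(self) -> int:
        # Flat entities + the single parent + its children.
        return len(self.flat) + 1 + len(self.children)


def plan_entities(num_flat: int = 100, num_children: int = 10) -> BenchmarkEntities:
    """Build the entity IDs a fixture would create (illustrative helper)."""
    return BenchmarkEntities(
        flat=[f"bench-entity-{i:03d}" for i in range(num_flat)],
        parent="bench-parent",  # assumed ID, not taken from the repo
        children=[f"bench-child-{i:03d}" for i in range(num_children)],
    )


print(plan_entities().total)                             # current sizing
print(plan_entities(num_flat=10, num_children=3).total)  # reduced sizing from option 2
```

Keeping the counts as parameters would let moto-based and LocalStack-based benchmarks pick different sizes without changing the tests that consume the dataclass.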
