chore: restructure test suite with unit/integration/e2e separation

## Summary

Restructure the test suite to separate fast unit tests from integration tests and end-to-end tests, following [pytest best practices](https://docs.pytest.org/en/stable/explanation/goodpractices.html) for test organization, fixture scoping, and selective execution.

## Motivation

- **Tests are slow**: Every test recreates mock DynamoDB tables (`mock_aws` + `create_tables` with `wait=True` for 3 tables) and spins up a full `TestClient` (triggering the app lifespan — logging, DB init, queue worker start/stop) per test function
- **Tests hang**: `test_example_endpoint` calls the real `/example` endpoint, which runs `ftw download` / `ftw infer` subprocesses against external URLs. Nothing mocks `InferenceService`, so the test attempts real satellite image downloads
- **No way to run fast checks**: All 49 tests are treated identically — pure validation tests (Pydantic schema checks) run through the same expensive fixture chain as full API integration tests
- **Fixture scope issues**: `dynamodb_tables` at `scope="function"` means moto setup/teardown runs ~30 times for tests that don't mutate shared state

## Current problems in detail

### 1. Expensive per-test setup
`conftest.py` fixtures are all `scope="function"`:
- `dynamodb_tables`: `mock_aws()` context + `create_tables()` (3 tables, `wait=True`) — every test
- `client`: `TestClient(app)` triggers full lifespan (`initialize_logging`, `initialize_database`, `initialize_services`, `start_background_workers`) — every test
- Lifespan teardown calls `stop_background_workers()` → `asyncio.gather` on `InMemoryQueue` workers — every test

### 2. No ML pipeline mocking for integration tests
`test_example_endpoint` sends real STAC URLs to the `/example` endpoint. The `InferenceService` dependency is never overridden, so it calls:
- `download_images()` → `ftw download` subprocess (network I/O)
- `execute_inference_pipeline()` → `ftw infer` subprocess (requires model checkpoint)
- `run_polygonize()` → `ftw polygonize` subprocess

This hangs indefinitely waiting for network/subprocess completion. The test is valid as an e2e test but should not run by default.
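
As a rough illustration of what an integration-level replacement could look like, the sketch below builds a spec-checked mock with `unittest.mock.create_autospec`. The `InferenceService` class here is a hypothetical stand-in: the three method names come from the list above, but their signatures and return values are assumptions, not the project's real API.

```python
from unittest.mock import create_autospec


class InferenceService:
    """Hypothetical stand-in; the real class lives in the server package."""

    def download_images(self, urls):  # assumed signature
        raise RuntimeError("would run `ftw download` (network I/O)")

    def execute_inference_pipeline(self, image_paths):  # assumed signature
        raise RuntimeError("would run `ftw infer` (needs a model checkpoint)")

    def run_polygonize(self, raster_path):  # assumed signature
        raise RuntimeError("would run `ftw polygonize`")


def make_fake_inference_service():
    """Build a mock that follows the class's method signatures and returns canned paths."""
    fake = create_autospec(InferenceService, instance=True)
    # Canned return values are illustrative only.
    fake.download_images.return_value = ["/tmp/fake_a.tif", "/tmp/fake_b.tif"]
    fake.execute_inference_pipeline.return_value = "/tmp/fake_inference.tif"
    fake.run_polygonize.return_value = "/tmp/fake_polygons.parquet"
    return fake
```

In `integration/conftest.py`, the `client` fixture could register such a fake through FastAPI's `app.dependency_overrides`, keyed by whichever provider function the app actually uses to supply `InferenceService`, and clear the override on teardown. Because the mock is autospecced, a call that no longer matches the method signature fails loudly instead of passing silently.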

### 3. Tests that don't need DynamoDB still pay for it
`TestModelValidation` in `test_model_validation.py` only tests Pydantic validation (`InferenceRequest`), yet it runs through the `client` fixture chain because it shares a file with `TestModelEndpoints` and `TestModelIntegration`, which do need it.

## Proposed changes

### Test tiers

| Tier | Marker | What it tests | Runs when |
|------|--------|--------------|-----------|
| **unit** | `@pytest.mark.unit` | Pure logic, Pydantic validation, config, no I/O | Every commit, default `uv run test` |
| **integration** | `@pytest.mark.integration` | API endpoints through moto + mocked ML pipeline | Every commit, default `uv run test` |
| **e2e** | `@pytest.mark.e2e` | Real `/example` endpoint → real `ftw` subprocess calls, network, model checkpoint | Manual or CI with model + network access |

The existing `test_example_endpoint` is preserved as an **e2e test** — it continues to call the real ML pipeline. A separate integration-level test should be added that mocks the pipeline to verify endpoint wiring without requiring network or model checkpoints.

### Directory structure
```
server/tests/
├── conftest.py                    # Shared fixtures (markers, simple data fixtures)
├── unit/
│   ├── conftest.py               # Unit-specific: no moto, no TestClient
│   ├── test_model_validation.py  # Pydantic schema validation
│   ├── test_source_coop.py       # Storage config/key generation (pure mocking)
│   ├── test_storage_factory.py   # get_storage() selection logic
│   └── test_name_generator.py    # Name generation logic
├── integration/
│   ├── conftest.py               # Integration: moto fixtures, TestClient, mock InferenceService
│   ├── test_api.py               # API endpoint tests (mocked ML pipeline)
│   ├── test_workflows.py         # Multi-step workflow tests
│   └── test_storage.py           # LocalStorage upload/download cycle
└── e2e/
    ├── conftest.py               # E2E: real services, requires model checkpoint + network
    └── test_example_pipeline.py  # Real /example endpoint → ftw download/infer/polygonize
```
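
With this layout, the tier marker could also be derived from the directory instead of decorating every class by hand. The following is a minimal sketch of a root `conftest.py` hook, assuming the tests live under `server/tests` relative to pytest's root directory; `tier_for` and `TIERS` are illustrative names, and the approach is an optional alternative rather than part of the proposal.

```python
from pathlib import Path
from typing import Optional

TIERS = ("unit", "integration", "e2e")


def tier_for(path: Path, tests_root: Path) -> Optional[str]:
    """Return the tier named by the first directory under tests_root, if any."""
    try:
        first = path.relative_to(tests_root).parts[0]
    except (ValueError, IndexError):
        # Path is outside tests_root, or is tests_root itself.
        return None
    return first if first in TIERS else None


def pytest_collection_modifyitems(config, items):
    """pytest hook: tag each collected test with its directory's tier marker."""
    # Assumption: the test tree sits at <rootdir>/server/tests.
    tests_root = Path(config.rootpath) / "server" / "tests"
    for item in items:
        tier = tier_for(Path(item.path), tests_root)
        if tier is not None:
            item.add_marker(tier)
```

Tests placed directly in `server/tests/` get no marker under this scheme. It would be worth confirming with `pytest --collect-only -m unit` that the markers are applied before `-m` filtering takes effect in the pytest version in use.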

### Pytest configuration
```toml
[tool.pytest.ini_options]
testpaths = ["server/tests"]
asyncio_mode = "auto"
markers = [
    "unit: Fast isolated tests (no I/O, no moto, no TestClient)",
    "integration: Tests requiring moto DynamoDB or full HTTP client",
    "e2e: End-to-end tests requiring model checkpoint and network access",
]
addopts = "-m 'not e2e'"
```

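The `addopts = "-m 'not e2e'"` setting excludes e2e tests by default. A `-m` given on the command line takes precedence over the one in `addopts`, so they can still be run explicitly:
```bash
pytest -m e2e  # only e2e tests
pytest -m ""   # empty expression: run everything
```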
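
The reason a command-line `-m` replaces the default rather than combining with it can be modelled with plain `argparse`: the `addopts` string is placed ahead of the command-line arguments, and for a single-value option the last occurrence wins. The sketch below is a simplified model of that behaviour, not pytest's own code, and `effective_marker_expression` is an illustrative name.

```python
import argparse
import shlex


def effective_marker_expression(addopts, cli_args):
    """Model which -m expression survives when addopts precedes the CLI args."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-m", dest="markexpr", default="")
    # addopts is split like a shell line and put in front of the CLI arguments.
    combined = shlex.split(addopts) + list(cli_args)
    namespace, _unknown = parser.parse_known_args(combined)
    return namespace.markexpr


default = effective_marker_expression("-m 'not e2e'", [])
only_e2e = effective_marker_expression("-m 'not e2e'", ["-m", "e2e"])
everything = effective_marker_expression("-m 'not e2e'", ["-m", ""])
```

Here `default` keeps the `addopts` expression, `only_e2e` is replaced by the command-line value, and `everything` ends up as the empty expression, which applies no marker filter.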

### Fixture changes
- **`dynamodb_tables`**: Keep `scope="function"` per [moto best practices](https://docs.getmoto.org/en/latest/docs/getting_started.html) for clean state per test, but move it to `integration/conftest.py` so that only integration tests pay the moto setup cost
- **`client`**: Move to `integration/conftest.py`, override `InferenceService` dependency to prevent real ML pipeline calls
- **Lifespan**: Either override with no-op for tests (avoids redundant DB init + worker lifecycle) or ensure dependency overrides prevent real work
- **Data fixtures** (`sample_bbox`, `model_ids`, etc.): Keep in root `conftest.py` — shared across all tiers
- **E2E fixtures**: Separate `e2e/conftest.py` with real `TestClient` (no mocks, full lifespan), skips if model checkpoint not present
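
The e2e skip condition could be driven by a small helper that reports which prerequisites are missing. This is a minimal sketch under stated assumptions: the `FTW_MODEL_CHECKPOINT` environment variable and the `missing_e2e_prerequisites` function are hypothetical names, since the issue does not say how the checkpoint location is configured, and network reachability is deliberately not probed.

```python
import os
import shutil
from pathlib import Path

# Hypothetical variable name; the project may locate the checkpoint differently.
CHECKPOINT_ENV_VAR = "FTW_MODEL_CHECKPOINT"


def missing_e2e_prerequisites(env=None):
    """Return human-readable descriptions of missing e2e prerequisites."""
    env = os.environ if env is None else env
    missing = []

    checkpoint = env.get(CHECKPOINT_ENV_VAR, "")
    if not checkpoint or not Path(checkpoint).is_file():
        missing.append(
            f"model checkpoint (set {CHECKPOINT_ENV_VAR} to an existing file)"
        )

    # The e2e test shells out to the `ftw` CLI, so it must be on PATH.
    if shutil.which("ftw") is None:
        missing.append("`ftw` CLI on PATH")

    return missing
```

In `e2e/conftest.py`, the result could feed `pytest.mark.skipif(bool(missing), reason=...)` or a `pytest.skip(...)` call inside a fixture, so that a machine without the checkpoint reports the e2e tests as skipped with a clear reason instead of hanging.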

### Run commands
```bash
uv run test                                    # Unit + integration (default, e2e excluded)
uv run --group test pytest -m unit             # Fast unit checks (~seconds)
uv run --group test pytest -m integration      # Integration tests only
uv run --group test pytest -m e2e              # E2E tests (requires model + network)
uv run --group test pytest -m ""               # Everything including e2e
```

## Checklist

- [ ] Add `markers` and `addopts` to `[tool.pytest.ini_options]` in `pyproject.toml`
- [ ] Create `server/tests/unit/`, `server/tests/integration/`, and `server/tests/e2e/` directories with `__init__.py` and `conftest.py`
- [ ] Move tests into appropriate directories based on their dependencies
- [ ] Mark test classes/functions with `@pytest.mark.unit`, `@pytest.mark.integration`, or `@pytest.mark.e2e`
- [ ] Preserve `test_example_endpoint` as `@pytest.mark.e2e` in `e2e/test_example_pipeline.py` (real pipeline, unchanged)
- [ ] Add new integration-level example endpoint test with mocked `InferenceService` (verifies endpoint wiring without network)
- [ ] Add `InferenceService` mock/override to `integration/conftest.py`
- [ ] Move moto + TestClient fixtures to `integration/conftest.py`
- [ ] Keep shared data fixtures in root `conftest.py`
- [ ] Add skip condition for e2e tests when model checkpoint is not present
- [ ] Add convenience script entries for `uv run test-unit` / `uv run test-integration` / `uv run test-e2e`
- [ ] Verify all tiers pass independently

## References

- [Pytest: Good Integration Practices](https://docs.pytest.org/en/stable/explanation/goodpractices.html)
- [Pytest: How to mark test functions with attributes](https://docs.pytest.org/en/stable/how-to/mark.html)
- [Pytest: Fixture scopes](https://docs.pytest.org/en/stable/how-to/fixtures.html#scope-sharing-fixtures-across-classes-modules-packages-or-session)
- [FastAPI: Testing](https://fastapi.tiangolo.com/tutorial/testing/)
- [FastAPI: Testing Events (Lifespan)](https://fastapi.tiangolo.com/advanced/testing-events/)
- [FastAPI: Async Tests](https://fastapi.tiangolo.com/advanced/async-tests/)
- [moto: Getting Started](https://docs.getmoto.org/en/latest/docs/getting_started.html)

