Skip to content

Commit a9a1caf

Browse files
committed
docs: simplify README
1 parent 5ebac88 commit a9a1caf

4 files changed

Lines changed: 152 additions & 128 deletions

File tree

README.md

Lines changed: 23 additions & 128 deletions
Original file line numberDiff line numberDiff line change
@@ -38,63 +38,16 @@ flowchart LR
3838
- [.NET 10 SDK](https://dotnet.microsoft.com/download)
3939
- [Docker](https://www.docker.com/) (for SQL Server + LocalStack)
4040

41-
## Local Setup
41+
## Quick Start
4242

43-
### 1. Start infrastructure
43+
See **[docs/local-setup.md](docs/local-setup.md)** for the full setup walkthrough. In brief:
4444

4545
```bash
4646
docker-compose up -d
47-
```
48-
49-
This starts:
50-
51-
- **Azure SQL Edge** on `localhost:1433` (ARM64-compatible SQL Server)
52-
- **LocalStack** on `localhost:4566` (local SQS emulator — queues auto-created via `scripts/init-localstack.sh`)
53-
54-
### 2. Create the database
55-
56-
```bash
57-
docker exec -it $(docker ps -q -f ancestor=mcr.microsoft.com/azure-sql-edge) \
58-
/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P 'YourStr0ngPassw0rd!' \
59-
-Q "CREATE DATABASE ScanEvents"
60-
```
61-
62-
The worker auto-creates the `ProcessingState` and `ParcelSummary` tables on startup via `DatabaseInitializer`.
63-
64-
### 3. Set dummy AWS credentials (LocalStack doesn't validate them)
65-
66-
```bash
67-
export AWS_ACCESS_KEY_ID=test
68-
export AWS_SECRET_ACCESS_KEY=test
69-
```
70-
71-
### 4. Configure the API base URL (required)
72-
73-
The worker cannot connect without this. Set it via user-secrets:
74-
75-
```bash
76-
dotnet user-secrets set "ScanEventApi:BaseUrl" "https://your-api-host" \
77-
--project src/ScanEventWorker
78-
```
79-
80-
Or edit `ScanEventApi:BaseUrl` directly in `src/ScanEventWorker/appsettings.json`.
81-
82-
### 5. Run the worker
83-
84-
```bash
47+
dotnet user-secrets set "ScanEventApi:BaseUrl" "https://your-api-host" --project src/ScanEventWorker
8548
dotnet run --project src/ScanEventWorker/ScanEventWorker.csproj
8649
```
8750

88-
### 6. Verify resumability
89-
90-
Stop the worker (`Ctrl+C`) and restart it. The log will show:
91-
92-
```
93-
Resuming from LastEventId=<n>
94-
```
95-
96-
confirming it picks up from where it left off.
97-
9851
## Running Tests
9952

10053
```bash
@@ -123,87 +76,29 @@ Everything else (NuGet packages, SQS queues, DB schema) is self-contained.
12376

12477
## Assumptions
12578

126-
- Events are returned ordered by `EventId` ascending
127-
- `EventId` is monotonically increasing — querying `FromEventId=X` reliably returns all events with ID ≥ X
128-
- The API returns an empty `ScanEvents` array when no more events exist (end-of-feed signal)
129-
- Only one worker instance runs at a time (no distributed locking required)
130-
- `RunId` comes from the nested `User.RunId` field in the JSON response
131-
- `StatusCode` may be an empty string
132-
- `PickedUpAtUtc` and `DeliveredAtUtc` are set on the **first** occurrence of their respective event types and are never overwritten by later events of the same type
133-
- Unknown `Type` values are stored as-is in `ParcelSummary` without setting pickup/delivery timestamps
79+
1. Events are returned ordered by `EventId` ascending
80+
2. `EventId` is monotonically increasing — querying `FromEventId=X` reliably returns all events with ID ≥ X
81+
3. The API returns an empty `ScanEvents` array when no more events exist (end-of-feed signal)
82+
4. Only one worker instance runs at a time (no distributed locking required)
83+
5. `RunId` comes from the nested `User.RunId` field in the JSON response
84+
6. `StatusCode` may be an empty string
85+
7. `PickedUpAtUtc` and `DeliveredAtUtc` are set on the **first** occurrence of their respective event types and are never overwritten by later events of the same type
86+
8. Unknown `Type` values are stored as-is in `ParcelSummary` without setting pickup/delivery timestamps
13487

13588
## Potential Improvements
13689

137-
- **Health checks** expose `/healthz` endpoint reporting queue depth and DB connectivity
138-
- **Metrics** OpenTelemetry counters for events processed/sec, SQS queue depth, DLQ size
139-
- **Horizontal scaling** run multiple `EventProcessorWorker` instances; SQS competing-consumer model handles this without coordination
140-
- **DLQ visibility** persist DLQ messages to a `FailedEvents` DB table for operational queries
141-
- **Rate limiting** token bucket on the API poller to avoid hammering the upstream service
142-
- **Database migrations**replace the `IF NOT EXISTS` initializer with FluentMigrator for versioned schema changes
143-
- **Production database**RDS SQL Server (added to CDK stack) instead of Docker for managed backups and HA
90+
- [ ] **Health checks**: expose `/healthz` endpoint reporting queue depth and DB connectivity
91+
- [ ] **Metrics**: OpenTelemetry counters for events processed/sec, SQS queue depth, DLQ size
92+
- [ ] **Horizontal scaling**: run multiple `EventProcessorWorker` instances; SQS competing-consumer model handles this without coordination
93+
- [ ] **DLQ visibility**: persist DLQ messages to a `FailedEvents` DB table for operational queries
94+
- [ ] **Rate limiting**: token bucket on the API poller to avoid hammering the upstream service
95+
- [ ] **Database migrations**: replace the `IF NOT EXISTS` initialiser with FluentMigrator for versioned schema changes
96+
- [ ] **Production database**: RDS SQL Server (managed via CDK stack) instead of Docker for managed backups and HA
14497

145-
## Downstream Workers Architecture
146-
147-
The current design is ready for fan-out without changes to this worker. Adding an SNS topic makes the event stream available to any number of downstream consumers:
148-
149-
```mermaid
150-
flowchart LR
151-
POLLER["API Poller"]
152-
SNS(["SNS Topic\n(fan-out)"])
153-
QA[["SQS Queue A"]]
154-
QB[["SQS Queue B"]]
155-
QC[["SQS Queue C"]]
156-
W1["This Worker → DB"]
157-
W2["Downstream Worker 1"]
158-
W3["Downstream Worker 2"]
159-
160-
POLLER --> SNS
161-
SNS --> QA --> W1
162-
SNS --> QB --> W2
163-
SNS --> QC --> W3
164-
```
165-
166-
**Alternatives considered:**
167-
168-
- **Outbox pattern** — write events to an outbox table, dedicated publisher reads and publishes to SNS/SQS. Stronger consistency guarantee but adds DB coupling and latency.
169-
- **CDC (Change Data Capture)** — enable SQL Server CDC on `ParcelSummary`, stream changes via Kafka Connect. Best for consumers that need the DB state, not the raw events.
170-
171-
## CDK Infrastructure
172-
173-
The `src/ScanEventWorker.Cdk` project defines the AWS infrastructure for production deployment:
174-
175-
```bash
176-
cd src/ScanEventWorker.Cdk
177-
cdk deploy
178-
```
179-
180-
Provisions:
181-
182-
- `scan-events-queue` — main SQS queue (visibility timeout: 30s)
183-
- `scan-events-dlq` — dead letter queue with `maxReceiveCount=3` redrive policy
184-
185-
> **Local development:** queues are created automatically by LocalStack when you run `docker-compose up` (via `scripts/init-localstack.sh`). You do not need to run `cdk deploy` locally.
186-
187-
## Design Rationale
188-
189-
This section documents the architectural decisions and patterns in the codebase, with particular attention to where DDD and FP principles were applied deliberately.
190-
191-
### Rich Domain Model
192-
193-
`ParcelSummary` is not a data bag. Its `ApplyScanEvent()` method encapsulates three business rules in one place: idempotency (events with `EventId <= LatestEventId` are silently ignored, making the MERGE-based persistence safe for at-least-once SQS delivery), first-occurrence timestamp protection (`PickedUpAtUtc ??= ...` and `DeliveredAtUtc ??= ...` ensure a later pickup event after delivery never overwrites the original), and event-type semantics (the `switch` on `ScanEventTypes` is the sole location in the codebase that interprets what an event type means). `ScanEventProcessor` is deliberately thin: it delegates business rules to the domain object and wraps infrastructure errors in `Result<T>`.
194-
195-
### Railway-Oriented Error Handling
196-
197-
`Result<T>` is a `readonly struct` with `Success`/`Failure` factory methods and a `Match()` combinator. It is used at every infrastructure boundary: `IScanEventApiClient.GetScanEventsAsync` returns parse and HTTP failures as values rather than throwing; `IScanEventProcessor.ProcessSingleAsync` wraps database errors without propagating them; `ScanEventProcessor.ProcessBatchAsync` processes each event independently so one failure does not abort the batch. Exceptions are reserved for genuinely unrecoverable failures such as missing configuration on startup.
198-
199-
### Decoupled Workers via SQS
200-
201-
`ApiPollerWorker` and `EventProcessorWorker` communicate exclusively through SQS, which provides several properties the design relies on. Fault isolation: a database outage halts `EventProcessorWorker` without affecting `ApiPollerWorker`; events accumulate in the queue and drain automatically on recovery. At-least-once delivery: SQS redelivers unacknowledged messages after the visibility timeout, and the idempotent MERGE in `ApplyScanEvent()` makes redelivery safe. Dead-letter handling: after three failed attempts, SQS moves messages to the DLQ for manual inspection with no bespoke retry logic required. Independent scaling: multiple `EventProcessorWorker` instances can compete for messages without coordination.
202-
203-
### Contracts Over Concretions
204-
205-
All cross-cutting dependencies are defined as interfaces in `ScanEventWorker.Contracts`: `IScanEventApiClient`, `IScanEventRepository`, `IMessageQueue`, and `IScanEventProcessor`. The boundaries are chosen at architectural seams, not merely for test convenience. `IMessageQueue` hides the SQS SDK behind a three-method surface; `IScanEventProcessor` separates orchestration (the worker) from business logic (the processor). This lets the BackgroundService tests use NSubstitute mocks with zero real infrastructure.
98+
---
20699

207-
### AOT as a Design Constraint
100+
## Further Reading
208101

209-
Native AOT (`PublishAot=true`) prohibits runtime reflection, which shaped several decisions. JSON serialization uses `[JsonSerializable]` source-gen contexts rather than `JsonSerializer` with runtime type discovery. Data access uses Dapper.AOT with the `[DapperAot]` attribute and interceptor-based source generation instead of EF Core. Value objects (`EventId`, `ParcelId`) are `readonly record struct` types: zero heap allocation, comparable by value, and fully AOT-safe. The constraint made the codebase more explicit at the cost of additional boilerplate, but also eliminated whole categories of runtime surprises.
102+
- [Local Setup](docs/local-setup.md) — Docker, database, credentials, and run steps
103+
- [Infrastructure](docs/infrastructure.md) — CDK stack and downstream fan-out architecture
104+
- [Design Rationale](docs/design-rationale.md) — Domain model, error handling, and AOT decisions

docs/design-rationale.md

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
# Design Rationale
2+
3+
This document covers the architectural decisions and patterns in the codebase, with particular attention to where DDD and FP principles were applied deliberately.
4+
5+
## Rich Domain Model
6+
7+
`ParcelSummary` is not a data bag. Its `ApplyScanEvent()` method encapsulates three business rules in one place: idempotency (events with `EventId <= LatestEventId` are silently ignored, making the MERGE-based persistence safe for at-least-once SQS delivery), first-occurrence timestamp protection (`PickedUpAtUtc ??= ...` and `DeliveredAtUtc ??= ...` ensure a later pickup event after delivery never overwrites the original), and event-type semantics (the `switch` on `ScanEventTypes` is the sole location in the codebase that interprets what an event type means). `ScanEventProcessor` is deliberately thin: it delegates business rules to the domain object and wraps infrastructure errors in `Result<T>`.
8+
9+
## Railway-Oriented Error Handling
10+
11+
`Result<T>` is a `readonly struct` with `Success`/`Failure` factory methods and a `Match()` combinator. It is used at every infrastructure boundary: `IScanEventApiClient.GetScanEventsAsync` returns parse and HTTP failures as values rather than throwing; `IScanEventProcessor.ProcessSingleAsync` wraps database errors without propagating them; `ScanEventProcessor.ProcessBatchAsync` processes each event independently so one failure does not abort the batch. Exceptions are reserved for genuinely unrecoverable failures such as missing configuration on startup.
12+
13+
## Decoupled Workers via SQS
14+
15+
`ApiPollerWorker` and `EventProcessorWorker` communicate exclusively through SQS, which provides several properties the design relies on. Fault isolation: a database outage halts `EventProcessorWorker` without affecting `ApiPollerWorker`; events accumulate in the queue and drain automatically on recovery. At-least-once delivery: SQS redelivers unacknowledged messages after the visibility timeout, and the idempotent MERGE in `ApplyScanEvent()` makes redelivery safe. Dead-letter handling: after three failed attempts, SQS moves messages to the DLQ for manual inspection with no bespoke retry logic required. Independent scaling: multiple `EventProcessorWorker` instances can compete for messages without coordination.
16+
17+
## Contracts Over Concretions
18+
19+
All cross-cutting dependencies are defined as interfaces in `ScanEventWorker.Contracts`: `IScanEventApiClient`, `IScanEventRepository`, `IMessageQueue`, and `IScanEventProcessor`. The boundaries are chosen at architectural seams, not merely for test convenience. `IMessageQueue` hides the SQS SDK behind a three-method surface; `IScanEventProcessor` separates orchestration (the worker) from business logic (the processor). This lets the BackgroundService tests use NSubstitute mocks with zero real infrastructure.
20+
21+
## AOT as a Design Constraint
22+
23+
Native AOT (`PublishAot=true`) prohibits runtime reflection, which shaped several decisions. JSON serialisation uses `[JsonSerializable]` source-gen contexts rather than `JsonSerializer` with runtime type discovery. Data access uses Dapper.AOT with the `[DapperAot]` attribute and interceptor-based source generation instead of EF Core. Value objects (`EventId`, `ParcelId`) are `readonly record struct` types: zero heap allocation, comparable by value, and fully AOT-safe. The constraint made the codebase more explicit at the cost of additional boilerplate, but also eliminated whole categories of runtime surprises.

docs/infrastructure.md

Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,43 @@
1+
# Infrastructure
2+
3+
## CDK Stack
4+
5+
The `src/ScanEventWorker.Cdk` project defines the AWS infrastructure for production deployment:
6+
7+
```bash
8+
cd src/ScanEventWorker.Cdk
9+
cdk deploy
10+
```
11+
12+
Provisions:
13+
14+
- `scan-events-queue` — main SQS queue (visibility timeout: 30s)
15+
- `scan-events-dlq` — dead letter queue with `maxReceiveCount=3` redrive policy
16+
17+
> **Local development:** queues are created automatically by LocalStack when you run `docker-compose up` (via `scripts/init-localstack.sh`). You do not need to run `cdk deploy` locally.
18+
19+
## Downstream Workers Architecture
20+
21+
The current design is ready for fan-out without changes to this worker. Adding an SNS topic makes the event stream available to any number of downstream consumers:
22+
23+
```mermaid
24+
flowchart LR
25+
POLLER["API Poller"]
26+
SNS(["SNS Topic\n(fan-out)"])
27+
QA[["SQS Queue A"]]
28+
QB[["SQS Queue B"]]
29+
QC[["SQS Queue C"]]
30+
W1["This Worker → DB"]
31+
W2["Downstream Worker 1"]
32+
W3["Downstream Worker 2"]
33+
34+
POLLER --> SNS
35+
SNS --> QA --> W1
36+
SNS --> QB --> W2
37+
SNS --> QC --> W3
38+
```
39+
40+
**Alternatives considered:**
41+
42+
- **Outbox pattern** — write events to an outbox table, dedicated publisher reads and publishes to SNS/SQS. Stronger consistency guarantee but adds DB coupling and latency.
43+
- **CDC (Change Data Capture)** — enable SQL Server CDC on `ParcelSummary`, stream changes via Kafka Connect. Best for consumers that need the DB state, not the raw events.

docs/local-setup.md

Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,63 @@
1+
# Local Setup
2+
3+
## Prerequisites
4+
5+
- [.NET 10 SDK](https://dotnet.microsoft.com/download)
6+
- [Docker](https://www.docker.com/) (for SQL Server + LocalStack)
7+
8+
## Steps
9+
10+
### 1. Start infrastructure
11+
12+
```bash
13+
docker-compose up -d
14+
```
15+
16+
This starts:
17+
18+
- **Azure SQL Edge** on `localhost:1433` (ARM64-compatible SQL Server)
19+
- **LocalStack** on `localhost:4566` (local SQS emulator — queues auto-created via `scripts/init-localstack.sh`)
20+
21+
### 2. Create the database
22+
23+
```bash
24+
docker exec -it $(docker ps -q -f ancestor=mcr.microsoft.com/azure-sql-edge) \
25+
/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P 'YourStr0ngPassw0rd!' \
26+
-Q "CREATE DATABASE ScanEvents"
27+
```
28+
29+
The worker auto-creates the `ProcessingState` and `ParcelSummary` tables on startup via `DatabaseInitializer`.
30+
31+
### 3. Set dummy AWS credentials (LocalStack doesn't validate them)
32+
33+
```bash
34+
export AWS_ACCESS_KEY_ID=test
35+
export AWS_SECRET_ACCESS_KEY=test
36+
```
37+
38+
### 4. Configure the API base URL (required)
39+
40+
The worker cannot connect without this. Set it via user-secrets:
41+
42+
```bash
43+
dotnet user-secrets set "ScanEventApi:BaseUrl" "https://your-api-host" \
44+
--project src/ScanEventWorker
45+
```
46+
47+
Or edit `ScanEventApi:BaseUrl` directly in `src/ScanEventWorker/appsettings.json`.
48+
49+
### 5. Run the worker
50+
51+
```bash
52+
dotnet run --project src/ScanEventWorker/ScanEventWorker.csproj
53+
```
54+
55+
### 6. Verify resumability
56+
57+
Stop the worker (`Ctrl+C`) and restart it. The log will show:
58+
59+
```
60+
Resuming from LastEventId=<n>
61+
```
62+
63+
confirming it picks up from where it left off.

0 commit comments

Comments
 (0)