`docs/infrastructure.md` (59 additions, 0 deletions)
@@ -41,3 +41,62 @@ flowchart LR

- **Outbox pattern**: write events to an outbox table; a dedicated publisher reads the table and publishes to SNS/SQS. Stronger consistency guarantee, but adds DB coupling and latency.
- **CDC (Change Data Capture)**: enable SQL Server CDC on `ParcelSummary` and stream changes via Kafka Connect. Best for consumers that need the DB state, not the raw events.

---

## Spec Constraint: Why Polling Exists

The entire `ApiPollerWorker` exists solely because the spec defines a **pull-only GET endpoint**. This is the root architectural constraint, and it has a cascading effect on every design decision in this codebase.

| Operational burden | Cursor state in SQL, DLQ monitoring | Managed by the platform |

Polling is a workaround, not a design choice. If the upstream offered a push mechanism, `ApiPollerWorker`, `ProcessingState`, and `SqsMessageQueue` could all be deleted.
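
To make the cost of the pull-only constraint concrete, here is a minimal sketch of a cursor-based polling step. It is written in Python purely for illustration and is not the codebase's `ApiPollerWorker`; `Page`, `PollerState`, `poll_once`, and `make_fake_feed` are hypothetical names, and the cursor format is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Page:
    """One response from a hypothetical pull-only GET endpoint."""
    events: list[dict]
    next_cursor: Optional[str]  # None means "nothing new yet"


@dataclass
class PollerState:
    """Stand-in for the cursor that the real system persists in SQL."""
    cursor: Optional[str] = None


def poll_once(fetch: Callable[[Optional[str]], Page],
              enqueue: Callable[[dict], None],
              state: PollerState) -> int:
    """Fetch one page after the stored cursor, enqueue it, then advance the cursor.

    The cursor moves only after every event is enqueued, so a crash mid-page
    leads to a re-fetch (at-least-once delivery) rather than a gap.
    """
    page = fetch(state.cursor)
    for event in page.events:
        enqueue(event)
    if page.next_cursor is not None:
        state.cursor = page.next_cursor
    return len(page.events)


def make_fake_feed(events: list[dict], page_size: int = 2):
    """In-memory substitute for the upstream API (assumption: numeric offsets as cursors)."""
    def fetch(cursor: Optional[str]) -> Page:
        start = int(cursor) if cursor is not None else 0
        batch = events[start:start + page_size]
        if not batch:
            return Page(events=[], next_cursor=None)
        return Page(events=batch, next_cursor=str(start + len(batch)))
    return fetch


if __name__ == "__main__":
    feed = make_fake_feed([{"id": i} for i in range(5)])
    queue: list[dict] = []
    state = PollerState()
    while poll_once(feed, queue.append, state):
        pass  # a real worker would sleep between polls instead of spinning
    print(len(queue), state.cursor)
```

Everything in this sketch (the cursor, the loop, the queue write) exists only to turn a pull API into a stream of events; with a push source, none of it would be needed.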

### The Unconstrained Design

If the scan event source could emit events (webhook, DynamoDB Streams, EventBridge, or S3 notifications), the architecture collapses into a fully serverless, event-driven pipeline:

```mermaid
architecture-beta
    service source(internet)[Scan Event Source]

    group aws(logos:aws)[AWS]

    service eb(logos:aws-eventbridge)[EventBridge or SNS] in aws
    service sfn(logos:aws-stepfunctions)[Step Functions] in aws
    service fn(logos:aws-lambda)[Lambda durable functions] in aws
    service db(logos:aws-dynamodb)[DynamoDB] in aws
    service s3(logos:aws-s3)[S3] in aws

    source:R --> L:eb
    eb:R --> L:sfn
    sfn:T --> T:fn
    eb:B --> T:fn
    sfn:R --> L:db
    sfn:B --> T:s3
```

**Each component earns its place:**

- **EventBridge / SNS**: zero-config fan-out; downstream consumers subscribe without any changes to the producer
- **AWS Step Functions**: replaces `EventProcessorWorker`'s manual retry/DLQ logic with durable, visual orchestration; retries, catch blocks, and compensating transactions are declared, not coded
- **DynamoDB**: replaces SQL Server with a serverless, horizontally scaled store; DynamoDB Streams can trigger further Lambdas, enabling second-order fan-out with no extra infrastructure to operate
- **S3**: each raw event lands in an S3 object on arrival; satisfies compliance audit requirements without any schema migration
- **Lambda**: each invocation is independent; a crash affects one event, not the entire feed; cold-start latency is acceptable because the current design already tolerates 5-second polling granularity
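
As an illustration of "declared, not coded", the sketch below builds a small Amazon States Language definition in which retry and failure handling are plain data. The state names, the ARNs passed in, and the retry numbers are all hypothetical; only the ASL field names (`Retry`, `Catch`, `ErrorEquals`, `IntervalSeconds`, `MaxAttempts`, `BackoffRate`, `ResultPath`) are standard.

```python
import json


def build_state_machine(process_fn_arn: str, archive_fn_arn: str) -> dict:
    """Return an ASL definition as a dict; both ARNs are caller-supplied placeholders."""
    return {
        "Comment": "Illustrative only: process one scan event with declared retries",
        "StartAt": "ProcessScanEvent",
        "States": {
            "ProcessScanEvent": {
                "Type": "Task",
                "Resource": process_fn_arn,
                # Retry policy is declared here instead of hand-written in a worker loop.
                "Retry": [{
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 2,
                    "MaxAttempts": 3,
                    "BackoffRate": 2.0,
                }],
                # After retries are exhausted, route the event to a fallback state,
                # which plays the role of a dead-letter path.
                "Catch": [{
                    "ErrorEquals": ["States.ALL"],
                    "ResultPath": "$.error",
                    "Next": "ArchiveFailedEvent",
                }],
                "End": True,
            },
            "ArchiveFailedEvent": {
                "Type": "Task",
                "Resource": archive_fn_arn,
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    definition = build_state_machine(
        "arn:aws:lambda:REGION:ACCOUNT:function:process-scan-event",  # placeholder
        "arn:aws:lambda:REGION:ACCOUNT:function:archive-failed-event",  # placeholder
    )
    print(json.dumps(definition, indent=2))
```

The resulting JSON is what would be handed to Step Functions when creating the state machine; changing the retry budget becomes a configuration edit rather than a code change.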

### Why This Matters for Microservices

The polling model creates a centralised bottleneck: exactly one process owns the cursor, owns the queue writes, and owns the fan-out decision. In a microservices context this is an anti-pattern, because every downstream team depends on this worker staying healthy.

An event-driven source eliminates the coupling entirely. Downstream services subscribe to EventBridge or SNS directly; the scan event producer has no knowledge of consumers, and neither does this worker.

The current SQS abstraction (`IMessageQueue`) was designed with this migration in mind: replacing `SqsMessageQueue` with an EventBridge publisher requires changing one DI registration in `Program.cs`.
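
The idea behind that single-registration swap can be sketched as follows. This is a Python analogue, not the project's C# code: `MessageQueue` mirrors `IMessageQueue`, but the `publish` method name, `FakeSqsQueue`, `FakeEventBus`, `ScanEventIngestor`, and `compose` are all hypothetical stand-ins.

```python
from typing import Callable, Protocol


class MessageQueue(Protocol):
    """Python analogue of IMessageQueue; the method name `publish` is an assumption."""

    def publish(self, event: dict) -> None: ...


class FakeSqsQueue:
    """Stand-in for SqsMessageQueue: a single queue drained by one worker."""

    def __init__(self) -> None:
        self.messages: list[dict] = []

    def publish(self, event: dict) -> None:
        self.messages.append(event)


class FakeEventBus:
    """Stand-in for a hypothetical EventBridge publisher: every subscriber sees every event."""

    def __init__(self) -> None:
        self._subscribers: list[Callable[[dict], None]] = []

    def subscribe(self, handler: Callable[[dict], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: dict) -> None:
        for handler in self._subscribers:
            handler(event)


class ScanEventIngestor:
    """Application code depends only on the abstraction, never on a concrete transport."""

    def __init__(self, queue: MessageQueue) -> None:
        self._queue = queue

    def ingest(self, event: dict) -> None:
        self._queue.publish(event)


def compose(queue: MessageQueue) -> ScanEventIngestor:
    """Composition root: the only place that is handed a concrete implementation."""
    return ScanEventIngestor(queue)


if __name__ == "__main__":
    bus = FakeEventBus()
    billing: list[dict] = []
    tracking: list[dict] = []
    bus.subscribe(billing.append)
    bus.subscribe(tracking.append)
    compose(bus).ingest({"parcel": "P-1"})  # was: compose(FakeSqsQueue())
    print(len(billing), len(tracking))
```

Only the argument to `compose` changes between the two transports; `ScanEventIngestor` is untouched, which is the property the DI registration provides in the real codebase.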