Merged
36 commits
c470c55
docs: add load test messaging workers design
claude Apr 21, 2026
70502fd
docs: add load test messaging workers implementation plan
claude Apr 21, 2026
182ef25
feat(loadgen): scaffold main.go with subcommand dispatch
claude Apr 21, 2026
b6e9ac1
feat(loadgen): add Preset type and four built-in presets
claude Apr 21, 2026
354afa0
test(loadgen): guard preset lookup ok in uniform/realistic shape tests
claude Apr 21, 2026
2ae8310
feat(loadgen): deterministic fixture generation from (preset, seed)
claude Apr 21, 2026
7cd9a80
test(loadgen): drop unused default branch in realistic room-type switch
claude Apr 21, 2026
3df2cd6
fix(loadgen): address gocritic/errcheck findings in preset.go
claude Apr 21, 2026
4641d98
refactor(loadgen): pass Preset by pointer; revert lint config bump
claude Apr 21, 2026
b53be6b
test(loadgen): cover pickMembers padding and sampleWithoutReplacement…
claude Apr 21, 2026
9a9b6bc
feat(loadgen): Seed and Teardown mongo collections from fixtures
claude Apr 21, 2026
3787437
feat(loadgen): Prometheus registry with loadgen collectors
claude Apr 21, 2026
68e48b3
feat(loadgen): collector correlates publishes with replies and broadc…
claude Apr 22, 2026
3d483b8
fix(loadgen): close race in Collector samples; add coverage tests
claude Apr 22, 2026
c10f31b
feat(loadgen): percentiles, summary printer, CSV export, exit code
claude Apr 22, 2026
9ef41ef
test(loadgen): drop redundant nolint; _test.go is already excluded fr…
claude Apr 22, 2026
a5d86e5
feat(loadgen): open-loop generator with injected publisher
claude Apr 22, 2026
7e79a79
fix(loadgen): clear Collector orphans on publish failure; tighten tests
claude Apr 22, 2026
2bad977
feat(loadgen): JetStream consumer-lag sampler
claude Apr 22, 2026
9c8d962
fix(loadgen): warn (not debug) on consumer poll errors; document Snap…
claude Apr 22, 2026
021a409
feat(loadgen): wire seed/run/teardown subcommands in main.go
claude Apr 22, 2026
eac94f2
fix(loadgen): skip byReqID in canonical mode to avoid false missing-r…
claude Apr 22, 2026
b4ea921
feat(loadgen): docker-compose harness, Dockerfile, grafana dashboard
claude Apr 22, 2026
feb4c19
fix(loadgen): drop NATS scrape job (port 8222 serves JSON, not Promet…
claude Apr 22, 2026
d3b1e54
feat(loadgen): scoped Makefile for harness
claude Apr 22, 2026
6084ba7
test(loadgen): integration test for end-to-end wiring
claude Apr 22, 2026
dd19404
docs(loadgen): add operator README
claude Apr 22, 2026
69c0eab
test(loadgen): add unit tests for main helpers and sampler Snapshot
claude Apr 22, 2026
57d9f93
fix(loadgen): address final review — indexes, canonical rate, DM broa…
claude Apr 22, 2026
1905810
refactor(loadgen): simplify pass — pre-compute content, unify handler…
claude Apr 22, 2026
eb8eea8
fix(loadgen): split sent counter into warmup/measured phases for clea…
claude Apr 22, 2026
fdde0d0
fix(loadgen): index users.account so broadcast-worker enrichment isn'…
claude Apr 24, 2026
54acee8
perf(loadgen): dispatch publishes to worker pool; add opt-in pprof
claude Apr 24, 2026
45ff2ad
Merge branch 'main' into claude/load-test-messaging-workers-tDKZn
hmchangw Apr 27, 2026
8a9e64d
fix: group to channel
hmchangw Apr 27, 2026
6ef91ee
fix linting
hmchangw Apr 27, 2026
2,780 changes: 2,780 additions & 0 deletions docs/superpowers/plans/2026-04-21-load-test-messaging-workers.md

Large diffs are not rendered by default.

203 changes: 203 additions & 0 deletions docs/superpowers/specs/2026-04-24-loadgen-worker-pool-design.md
@@ -0,0 +1,203 @@
# Loadgen Worker-Pool Dispatch + pprof — Design

## Purpose

The loadgen's actual publish rate falls materially below the target at
moderate throughput: at `--rate=1000` the observed rate is ~775 msg/s
(~77% delivery). Root cause: the publisher runs serially on the
`time.Ticker`'s goroutine, and `time.Ticker` drops ticks that fire while
a publish is still in progress. Any per-publish stall (NATS write-lock
contention, a GC pause, a scheduler hiccup) longer than the 1 ms/tick
budget silently loses a tick.

This spec fixes that by dispatching publishes to a small worker pool,
and it adds opt-in pprof so future bottlenecks are diagnosable.

## Scope

### In scope

- `Generator.Run` dispatches each tick's publish to a bounded pool of
goroutines. The ticker itself stays punctual.
- New env var `MAX_IN_FLIGHT` (default `200`) caps concurrent publishes.
Saturation (pool full when a tick fires) is an explicit signal, not a
silent drop: the ticker records
`loadgen_publish_errors_total{reason="saturated"}` and moves on.
- `MAX_IN_FLIGHT=0` falls back to the current serial behavior. Useful as
  a bisection switch and as a conservative setting for operators who
  want reproducible comparisons.
- On graceful shutdown / `ctx.Done()`, `Run` returns only after all
in-flight publishes drain (bounded by a small timeout).
- New env var `PPROF_ADDR` (default `""`, meaning disabled). When set
(e.g. `:6060`), loadgen exposes `net/http/pprof` handlers on a
separate HTTP server. Never on by default — pprof isn't exposed in
production-ish deployments unless the operator opts in.
- Docker-compose loadgen service documents both new env vars.

### Out of scope

- Changes to the Collector, ConsumerSampler, Report, Preset, Seed, or
integration test — none are publish-hot-path.
- `golang.org/x/time/rate.Limiter` — the worker-pool fix addresses the
real structural cause (ticker/publish coupling). If worker-pool
saturation becomes the new bottleneck, re-evaluate then.
- `sync.Pool` allocation-reuse tuning — defer until pprof identifies GC
as the next-order concern.
- Dedicated NATS connection for publishes vs. subscriptions — only
justified if pprof identifies the NATS write lock as the bottleneck
after the worker pool lands.
- Default-rate bump — reasoned about separately.

## Architecture

Before:

```text
ticker goroutine: [wait tick] → publishOne (JSON + NATS write + metrics) → [wait tick] → …
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
one slow call here silently loses a tick
```

After:

```text
ticker goroutine: [wait tick] → reserve sem slot → spawn publish goroutine → [wait tick] → …

publish goroutine: [publishOne] → release sem slot
publish goroutine: [publishOne] → release sem slot
publish goroutine: [publishOne] → release sem slot (up to MAX_IN_FLIGHT concurrently)
```

The ticker goroutine's per-tick work shrinks to a semaphore send + goroutine
spawn — tens of nanoseconds. It cannot overshoot the ticker interval at any
realistic rate.

## Components

### `Generator.Run` (modified)

- Read `g.cfg.MaxInFlight` from `GeneratorConfig`.
- If `MaxInFlight <= 0`: run serially as today (preserves legacy behavior
and gives a bisection switch).
- Else: create `sem := make(chan struct{}, MaxInFlight)` and
`var wg sync.WaitGroup`. On each tick, non-blocking `select`:
- Slot available: take it, `wg.Add(1)`, `go func() { defer wg.Done();
defer func() { <-sem }(); g.publishOne(ctx) }()`.
- No slot: increment
`loadgen_publish_errors_total{reason="saturated"}` and continue —
the tick is dropped but at least it's observable.
- On `ctx.Done()`: stop the ticker, then `wg.Wait()` with a bounded grace
period (5 s). If the grace expires, log and return — in-flight
goroutines complete on their own after NATS drain in main.

### `GeneratorConfig` (modified)

Add one field:

```go
type GeneratorConfig struct {
… existing fields …
MaxInFlight int
}
```

### `main.go` (modified)

Add to `config`:

```go
type config struct {
… existing fields …
MaxInFlight int `env:"MAX_IN_FLIGHT" envDefault:"200"`
PProfAddr string `env:"PPROF_ADDR" envDefault:""`
}
```

Pass `cfg.MaxInFlight` into `GeneratorConfig` when constructing the generator.

On startup, if `PProfAddr != ""`: register `net/http/pprof` handlers on a
new `http.ServeMux` and start a separate `http.Server` listening on that
addr. Log the resulting URL. The server doesn't share the metrics mux —
pprof is genuinely separate, opt-in infrastructure, and keeping it off the
metrics port avoids accidental exposure when the metrics mux is scraped
by Prometheus.

On `ctx.Done()`: gracefully shut down the pprof server with a 2 s timeout.

### Metrics

No new metrics. The existing `loadgen_publish_errors_total` counter with
`reason="saturated"` is the single new label value for pool saturation.
This keeps the Grafana dashboard's "Publish errors/sec by reason" panel
working out of the box.

## Error handling

- The `sem <- struct{}{}` send never blocks because it sits in a
  `select` with a `default` branch — if the pool is full, we record
  saturation and move on. No unbounded goroutine growth under sustained
  overload.
- Inside each publish goroutine, `publishOne` already handles its own
errors (counters for marshal/publish failures, `RecordPublishFailed`
on the Collector).
- Graceful shutdown: the `Run` method returns only after in-flight
publishes drain or the bounded grace period elapses. The caller
(`main.go runRun`) already calls `collector.DiscardBefore` and
`collector.Finalize` after `Run` returns, so late-arriving publishes
correctly integrate with the summary.

## Testing

### New unit test

`TestGenerator_MaxInFlightZeroRunsSerially` — with `MaxInFlight=0`, the
generator's behavior is unchanged from today. Reuses the existing
`TestGenerator_SendsExpectedCount` assertion style.

### Adjusted unit test

`TestGenerator_SendsExpectedCount` — still valid with `MaxInFlight > 0`,
and the count should land closer to the theoretical target since the
ticker is no longer blocked by publishes.

### New unit test

`TestGenerator_PoolSaturationCountedAsError` — artificially slow the
publisher via an injected blocking `Publisher`. Run at a rate that
exceeds the pool's capacity. Assert the `saturated` counter increments.

### Integration test

No change. The existing `tools/loadgen/integration_test.go` exercises
`Generator.Run` with a fake gatekeeper + broadcast-worker and makes no
assumptions about ticker coupling.

### Coverage target

`generator.go` to stay at ≥ 90% for `Run`, `publishOne`, `content` per
the existing plan.

## Dependencies

No new third-party dependencies. All new code uses stdlib: `net/http`,
`net/http/pprof`, `sync`.

## Rollout

- Both env vars have safe defaults (`MAX_IN_FLIGHT=200`, `PPROF_ADDR=""`).
- Existing deployments pick up the worker pool automatically with
improved actual-rate fidelity at moderate throughput. Operators
concerned about the behavior change can set `MAX_IN_FLIGHT=0` to
get the legacy serial path.
- pprof stays off unless explicitly enabled via `PPROF_ADDR`.
- Internal-only to the loadgen service; no cross-service contract
change.

## Future work (deferred)

- Dedicated publish-side `*nats.Conn` — only if profiling identifies the
NATS connection write lock as the remaining bottleneck.
- `sync.Pool` for `SendMessageRequest` / `MessageEvent` / byte buffers
to reduce per-publish GC pressure — only if GC shows up in a
profile.
- Background UUID generation — only if `crypto/rand` shows up
prominently.
4 changes: 2 additions & 2 deletions go.mod
@@ -7,13 +7,15 @@ require (
github.com/caarlos0/env/v11 v11.4.0
github.com/coreos/go-oidc/v3 v3.17.0
github.com/docker/docker v27.1.1+incompatible
github.com/elastic/go-elasticsearch/v8 v8.19.3
github.com/gin-gonic/gin v1.12.0
github.com/gocql/gocql v1.7.0
github.com/google/uuid v1.6.0
github.com/nats-io/jwt/v2 v2.8.1
github.com/nats-io/nats-server/v2 v2.12.6
github.com/nats-io/nats.go v1.50.0
github.com/nats-io/nkeys v0.4.15
github.com/prometheus/client_golang v1.23.2
github.com/redis/go-redis/v9 v9.18.0
github.com/stretchr/testify v1.11.1
github.com/testcontainers/testcontainers-go v0.34.0
@@ -52,7 +54,6 @@ require (
github.com/docker/go-connections v0.5.0 // indirect
github.com/docker/go-units v0.5.0 // indirect
github.com/elastic/elastic-transport-go/v8 v8.8.0 // indirect
github.com/elastic/go-elasticsearch/v8 v8.19.3 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
github.com/gabriel-vasile/mimetype v1.4.13 // indirect
github.com/gin-contrib/sse v1.1.1 // indirect
@@ -94,7 +95,6 @@ require (
github.com/pkg/errors v0.9.1 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/power-devops/perfstat v0.0.0-20210106213030-5aafc221ea8c // indirect
github.com/prometheus/client_golang v1.23.2 // indirect
github.com/prometheus/client_model v0.6.2 // indirect
github.com/prometheus/common v0.67.5 // indirect
github.com/prometheus/otlptranslator v1.0.0 // indirect
12 changes: 0 additions & 12 deletions go.sum
@@ -8,8 +8,6 @@ github.com/Marz32onE/instrumentation-go/otel-nats v0.2.0 h1:J+S/NmcUf+dSXQMzNkNV
github.com/Marz32onE/instrumentation-go/otel-nats v0.2.0/go.mod h1:xgj7JbYX3qHLZ8X7A6Hvc1yeE+t4L+KAgeo9h0JWJ1o=
github.com/Microsoft/go-winio v0.6.2 h1:F2VQgta7ecxGYO8k3ZZz3RS8fVIXVxONVUPlNERoyfY=
github.com/Microsoft/go-winio v0.6.2/go.mod h1:yd8OoFMLzJbo9gZq8j5qaps8bJ9aShtEA8Ipt1oGCvU=
github.com/antithesishq/antithesis-sdk-go v0.4.3-default-no-op h1:+OSa/t11TFhqfrX0EOSqQBDJ0YlpmK0rDSiB19dg9M0=
github.com/antithesishq/antithesis-sdk-go v0.4.3-default-no-op/go.mod h1:IUpT2DPAKh6i/YhSbt6Gl3v2yvUZjmKncl7U91fup7E=
github.com/antithesishq/antithesis-sdk-go v0.6.0-default-no-op h1:kpBdlEPbRvff0mDD1gk7o9BhI16b9p5yYAXRlidpqJE=
github.com/antithesishq/antithesis-sdk-go v0.6.0-default-no-op/go.mod h1:IUpT2DPAKh6i/YhSbt6Gl3v2yvUZjmKncl7U91fup7E=
github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
@@ -110,8 +108,6 @@ github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeN
github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8=
github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU=
github.com/google/go-tpm v0.9.3 h1:+yx0/anQuGzi+ssRqeD6WpXjW2L/V0dItUayO0i9sRc=
github.com/google/go-tpm v0.9.3/go.mod h1:h9jEsEECg7gtLis0upRBQU+GhYVH6jMjrFxI8u6bVUY=
github.com/google/go-tpm v0.9.8 h1:slArAR9Ft+1ybZu0lBwpSmpwhRXaa85hWtMinMyRAWo=
github.com/google/go-tpm v0.9.8/go.mod h1:h9jEsEECg7gtLis0upRBQU+GhYVH6jMjrFxI8u6bVUY=
github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
@@ -146,8 +142,6 @@ github.com/magiconair/properties v1.8.7 h1:IeQXZAiQcpL9mgcAe1Nu6cX9LLw6ExEHKjN0V
github.com/magiconair/properties v1.8.7/go.mod h1:Dhd985XPs7jluiymwWYZ0G4Z61jb3vdS329zhj2hYo0=
github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY=
github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
github.com/minio/highwayhash v1.0.3 h1:kbnuUMoHYyVl7szWjSxJnxw11k2U709jqFPPmIUyD6Q=
github.com/minio/highwayhash v1.0.3/go.mod h1:GGYsuwP/fPD6Y9hMiXuapVvlIUEhFhMTh0rxU3ik1LQ=
github.com/minio/highwayhash v1.0.4-0.20251030100505-070ab1a87a76 h1:KGuD/pM2JpL9FAYvBrnBBeENKZNh6eNtjqytV6TYjnk=
github.com/minio/highwayhash v1.0.4-0.20251030100505-070ab1a87a76/go.mod h1:GGYsuwP/fPD6Y9hMiXuapVvlIUEhFhMTh0rxU3ik1LQ=
github.com/moby/docker-image-spec v1.3.1 h1:jMKff3w6PgbfSa69GfNg+zN/XLhfXJGnEx3Nl2EsFP0=
@@ -173,8 +167,6 @@ github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
github.com/nats-io/jwt/v2 v2.8.1 h1:V0xpGuD/N8Mi+fQNDynXohVvp7ZztevW5io8CUWlPmU=
github.com/nats-io/jwt/v2 v2.8.1/go.mod h1:nWnOEEiVMiKHQpnAy4eXlizVEtSfzacZ1Q43LIRavZg=
github.com/nats-io/nats-server/v2 v2.11.0 h1:fdwAT1d6DZW/4LUz5rkvQUe5leGEwjjOQYntzVRKvjE=
github.com/nats-io/nats-server/v2 v2.11.0/go.mod h1:leXySghbdtXSUmWem8K9McnJ6xbJOb0t9+NQ5HTRZjI=
github.com/nats-io/nats-server/v2 v2.12.6 h1:Egbx9Vl7Ch8wTtpXPGqbehkZ+IncKqShUxvrt1+Enc8=
github.com/nats-io/nats-server/v2 v2.12.6/go.mod h1:4HPlrvtmSO3yd7KcElDNMx9kv5EBJBnJJzQPptXlheo=
github.com/nats-io/nats.go v1.50.0 h1:5zAeQrTvyrKrWLJ0fu02W3br8ym57qf7csDzgLOpcds=
@@ -277,8 +269,6 @@ go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.43.0 h1:88Y4s2C8oTui1LGM6bT
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.43.0/go.mod h1:Vl1/iaggsuRlrHf/hfPJPvVag77kKyvrLeD10kpMl+A=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.43.0 h1:RAE+JPfvEmvy+0LzyUA25/SGawPwIUbZ6u0Wug54sLc=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.43.0/go.mod h1:AGmbycVGEsRx9mXMZ75CsOyhSP6MFIcj/6dnG+vhVjk=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.19.0 h1:IeMeyr1aBvBiPVYihXIaeIZba6b8E1bYp7lbdxK8CQg=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.19.0/go.mod h1:oVdCUtjq9MK9BlS7TtucsQwUcXcymNiEDjgDD2jMtZU=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.43.0 h1:3iZJKlCZufyRzPzlQhUIWVmfltrXuGyfjREgGP3UUjc=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.43.0/go.mod h1:/G+nUPfhq2e+qiXMGxMwumDrP5jtzU+mWN7/sjT2rak=
go.opentelemetry.io/otel/exporters/prometheus v0.65.0 h1:jOveH/b4lU9HT7y+Gfamf18BqlOuz2PWEvs8yM7Q6XE=
@@ -356,8 +346,6 @@ golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
golang.org/x/text v0.3.8/go.mod h1:E6s5w1FMmriuDzIBO73fBruAKo1PCIq6d2Q6DHfQ8WQ=
golang.org/x/text v0.35.0 h1:JOVx6vVDFokkpaq1AEptVzLTpDe9KGpj5tR4/X+ybL8=
golang.org/x/text v0.35.0/go.mod h1:khi/HExzZJ2pGnjenulevKNX1W67CUy0AsXcNubPGCA=
golang.org/x/time v0.11.0 h1:/bpjEDfN9tkoN/ryeYHnv5hcMlc8ncjMcM4XBk5NWV0=
golang.org/x/time v0.11.0/go.mod h1:CDIdPxbZBQxdj6cxyCIdrNogrJKMJ7pr37NYpMcMDSg=
golang.org/x/time v0.15.0 h1:bbrp8t3bGUeFOx08pvsMYRTCVSMk89u4tKbNOZbp88U=
golang.org/x/time v0.15.0/go.mod h1:Y4YMaQmXwGQZoFaVFk4YpCt4FLQMYKZe9oeV/f4MSno=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
12 changes: 12 additions & 0 deletions pkg/subject/subject.go
@@ -177,6 +177,18 @@ func RoomsGetWildcard() string {
return "chat.user.*.request.rooms.get.*"
}

func UserResponseWildcard() string {
return "chat.user.*.response.>"
}

func RoomEventWildcard() string {
return "chat.room.*.event"
}

func UserRoomEventWildcard() string {
return "chat.user.*.event.room"
}
Comment on lines +254 to +264
🛠️ Refactor suggestion | 🟠 Major

Add unit tests for the new exported helpers.

As per coding guidelines: "Every exported function in pkg/ must have corresponding test cases." Please add tests for UserResponseWildcard, RoomEventWildcard, and UserRoomEventWildcard in pkg/subject/subject_test.go asserting the exact returned strings.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/subject/subject.go` around lines 180 - 190, Add unit tests for the three
new exported helpers: create tests in the package test file that call
UserResponseWildcard, RoomEventWildcard, and UserRoomEventWildcard and assert
they return exactly "chat.user.*.response.>", "chat.room.*.event", and
"chat.user.*.event.room" respectively; implement either three small Test...
functions or a table-driven TestWildcards that uses testing.T and
t.Fatalf/t.Errorf to fail on mismatches, referencing the functions by name so
the test imports the same package and verifies the exact string values.


// --- natsrouter patterns (use {param} placeholders for named extraction) ---

func MsgHistoryPattern(siteID string) string {
59 changes: 59 additions & 0 deletions tools/loadgen/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
# loadgen

Capacity-baseline load generator for the single-site messaging pipeline
(`message-gatekeeper` → `MESSAGES_CANONICAL` → `message-worker` +
`broadcast-worker`). Single Go binary with three subcommands.

## Quick start

```
make -C tools/loadgen/deploy up
make -C tools/loadgen/deploy seed PRESET=medium
make -C tools/loadgen/deploy run PRESET=medium RATE=500 DURATION=60s
```

For live dashboards:

```
make -C tools/loadgen/deploy run-dashboards PRESET=medium
# Grafana at http://localhost:3000 (anonymous admin)
```

Tear down:

```
make -C tools/loadgen/deploy down
```

## Presets

| preset | users | rooms | notes |
|-------------|--------|-------|--------------------------------------------------------|
| `small` | 10 | 5 | uniform, 200-byte content |
| `medium` | 1 000 | 100 | uniform, 200-byte content |
| `large` | 10 000 | 1 000 | uniform, 200-byte content |
| `realistic` | 1 000 | 100 | Zipf senders, mixed room sizes, 50–2000 bytes, mentions |

## Subcommands

- `loadgen seed --preset=<name> [--seed=42]` — idempotently populate
MongoDB with deterministic fixtures.
- `loadgen run --preset=<name> [flags]` — open-loop publish at `--rate`
msgs/sec for `--duration`, print a summary at the end. Flags:
`--seed`, `--warmup`, `--inject=frontdoor|canonical`, `--csv=<path>`.
- `loadgen teardown` — drop the three seeded collections.

## Reading the summary

- `final_pending == 0` on both durables, zero errors → the pipeline is
sustaining your target rate.
- `final_pending` climbing, or error counts > 0 → over capacity or a
regression upstream of the worker.

## Non-goals

- Not a CI regression gate. Invoked manually.
- Not an auth benchmark. Uses shared `backend.creds`.
- Not a cross-site benchmark. Single-site only.
- Not an absolute-number tool. Numbers vary by host — compare within one
machine across changes, don't compare across machines.