Commit 44d1643

Merge origin/main: adopt Kartik's metrics watcher as Exercise 4

Pulls in Kartik's 13 commits from `main`. The substantive addition is a complete Go metrics watcher at `exercises/04_go_agent/`: a worker that joins the tailnet via `tsnet`, scrapes `node_exporter` metrics off a separate tailnet node, and runs a Temporal Schedule that asks Claude via Aperture for a plain-English health report. This replaces the stubbed "same weather agent in Go" placeholder that was on `init`.

The rest of the merge is a reframe: slides, docs, and the workshop overview now describe Exercise 4 as a tsnet + Schedules + Claude exercise instead of a Python-to-Go translation.

Also ports Kartik's `temporal-ts-net` install changes (the prebuilt curl one-liner, no Go toolchain required) into `docs/infrastructure.md`, since his diffs landed on files that don't exist on `init` (`instructor/`).

Conflict resolutions:

- **`exercises/04_go_agent/practice/{main.go, workflow.go, activities.go, go.mod, go.sum}`** and **`exercises/04_go_agent/README.md`**: took Kartik's versions wholesale. Two new test files (`activities_test.go`, `workflow_test.go`) came along as additions
- **`.gitignore`**: unioned both sides. Kept the Slidev, MkDocs, Playwright, and `commit-msg.md`/`.ai-sessions/` entries from init; added Kartik's Go build-artifact catch-alls (`/bin/`, `*.exe`, `*.test`, `*.out`, the Ex4 binary), his tsnet-state directories (`/lab-worker/`, `**/workshop-tsnet/`), and his auth-key patterns (`*.env`, `tskey-*`)
- **`README.md`**: kept the init-side Ex3 timing (15 min) and the community-guide footer. Adopted Kartik's Ex4 row verbatim, updating "Stretch" to "15 min" to match the promoted in-session slot

Reframe edits folded into the merge commit so the repo is never in the intermediate state where code says "metrics watcher" but slides say "weather agent in Go":

- `slides/slides.md`: section title becomes "temporal-ts-net and Metrics Watcher". New "Your Worker Can Join the Tailnet Too" slide shows `tsnet` from the client side (Dial, not Listen). Replaces the "Go Agent Preview" slide with an "Exercise 4: Metrics Watcher" preview showing `HealthCheckWorkflow`. Adds a proper `layout: exercise` card for Ex4 (15 min). Wrap-up "What We Built" table broadened: Temporal row mentions Schedules, Tailscale row mentions tsnet in workers, AI row covers both OpenAI+Python and Claude+Go
- `slides/theme-temporal/components/WorkshopToc.vue`: `tsnet` row relabelled to "temporal-ts-net and Metrics Watcher"
- `docs/workshop-overview.md`: Ex4 line reframed from "translates the weather agent to Go" to the metrics watcher description
- `docs/architecture.md`: topology diagram gains the `metrics-server` node and the Anthropic upstream; Aperture prose broadened from "sits between attendee code and OpenAI" to "between attendee code and the LLM providers"
- `docs/infrastructure.md`: drops the "Install Go 1.26+" section entirely and replaces the `temporal-ts-net` build with the `install.sh` curl one-liner (port of Kartik's `2bedf4e`). The installer handles `amd64`/`arm64` automatically, so his arm64 note folds into a one-liner
- `exercises/04_go_agent/README.md` and `exercises/04_go_agent/practice/activities.go`: em-dash sweep to match the keyboard-only rule applied to the rest of the repo. Six prose em-dashes in the README, three in the Claude prompt in `AnalyzeMetrics`. Two "no default" table cells switched from em-dash to `(none)`

2 parents deee022 + 77c4ba5 commit 44d1643

15 files changed

Lines changed: 921 additions & 183 deletions

.gitignore

Lines changed: 15 additions & 2 deletions
@@ -207,11 +207,24 @@ marimo/_lsp/
 __marimo__/
 
 # Go
-exercises/04_go_agent/practice/go-agent
-exercises/04_go_agent/solution/go-agent
+/bin/
+*.exe
+*.test
+*.out
+exercises/04_go_agent/practice/04_go_agent
+exercises/04_go_agent/practice/practice
 exercises/02_explore_tailscale/go-hello-tsnet/practice/practice
 exercises/02_explore_tailscale/go-hello-tsnet/solution/solution
 
+# Tailscale tsnet state (contains auth keys / node private keys)
+/lab-worker/
+**/lab-worker/
+**/workshop-tsnet/
+
+# Auth keys / secrets
+*.env
+tskey-*
+
 # Node (Slidev)
 node_modules/
 slides/node_modules/

README.md

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ Then start with [Exercise 1](exercises/01_hello_tailnet/README.md).
 | 1 | [Hello Tailnet](exercises/01_hello_tailnet/README.md) | 15 min | Run a geo-IP workflow on the shared Temporal server via Tailscale |
 | 2 | [Explore Tailscale](exercises/02_explore_tailscale/README.md) | 15 min | Discover your network, understand Aperture, run a Go worker over `tsnet` |
 | 3 | [Weather Agent](exercises/03_weather_agent/README.md) | 15 min | Build a durable AI agent with LLM calls routed through Aperture |
-| 4 | [Go Agent](exercises/04_go_agent/README.md) | 15 min | The same weather agent, in Go |
+| 4 | [Metrics Watcher](exercises/04_go_agent/README.md) | 15 min | Schedule a Go tsnet worker to fetch metrics from a tailnet node and summarize them with Claude |
 
 ## Temporal Web UI

docs/architecture.md

Lines changed: 5 additions & 1 deletion
@@ -14,10 +14,14 @@ flowchart LR
 TS["Temporal Dev Server<br/>temporal-dev:7233<br/>temporal-dev:8233"]
 AP["Aperture<br/>API Gateway"]
 end
+MS["metrics-server<br/>node_exporter:9100"]
 OAI["OpenAI API<br/>(shared key)"]
+CLA["Anthropic API<br/>(shared key)"]
 VM <-. Tailnet .-> TS
 VM <-. Tailnet .-> AP
+VM <-. Tailnet .-> MS
 AP --> OAI
+AP --> CLA
 ```
 
 Everything between the attendee machine and the shared infrastructure rides on an encrypted Tailscale mesh. There is no public port open on the VPS. If you're not on the tailnet, you can't reach it.
@@ -38,7 +42,7 @@ A Temporal CLI extension that runs the dev server and joins the tailnet via [tsn
 
 ### Aperture (API gateway)
 
-Sits between attendee code and OpenAI. Holds the real API key, forwards requests to `api.openai.com`, and enforces per-identity rate limits using the caller's Tailscale identity. Attendees never see the key, and one person running 500 agents can't burn the whole budget.
+Sits between attendee code and the LLM providers. Holds the real API keys (OpenAI for the Python weather agent, Anthropic for the Go metrics watcher), forwards requests upstream, and enforces per-identity rate limits using the caller's Tailscale identity. Attendees never see the keys, and one person running 500 agents can't burn the whole budget.
 
 ## The two agent patterns

docs/infrastructure.md

Lines changed: 5 additions & 17 deletions
@@ -21,19 +21,6 @@ The workshop needs one long-lived VPS that runs `temporal-ts-net` and joins the
 
 If you're building something that runs for longer than a workshop or exposes the server to arbitrary traffic, switch to a real deployment before you hit the limits.
 
-### Install Go 1.26+
-
-`temporal-ts-net` currently requires Go 1.26.1+ (tsnet dependency). On `amd64`:
-
-```shell
-wget https://go.dev/dl/go1.26.2.linux-amd64.tar.gz
-sudo tar -C /usr/local -xzf go1.26.2.linux-amd64.tar.gz
-echo 'export PATH=$PATH:/usr/local/go/bin:$HOME/go/bin' >> ~/.bashrc
-source ~/.bashrc
-```
-
-On `arm64` (for example, Hetzner ARM), swap `linux-amd64` for `linux-arm64`.
-
 ### Install the Temporal CLI
 
 ```shell
@@ -45,14 +32,15 @@ temporal --version # must be v1.6.0+ for extension support
 
 ### Install `temporal-ts-net`
 
+A prebuilt binary for your architecture, no Go toolchain required:
+
 ```shell
-git clone https://github.com/temporal-community/temporal-ts-net
-cd temporal-ts-net
-go install ./cmd/temporal-ts_net
-cd ..
+curl -sSfL https://raw.githubusercontent.com/temporal-community/temporal-ts-net/main/install.sh | sh
 temporal help --all | grep ts-net # verify the extension is found
 ```
 
+The installer picks the right binary for `amd64` or `arm64`.
+
 ### Get a Tailscale auth key for the server
 
 In the Tailscale admin console, under **Settings > Keys**:

docs/workshop-overview.md

Lines changed: 2 additions & 2 deletions
@@ -33,10 +33,10 @@ Every LLM call goes through Aperture, which holds the real OpenAI key and applie
 | AI agents on Temporal (talk) | 10 min |
 | Exercise 3 - Weather agent | 15 min |
 | Rate-limit demo (everyone fires at once) | 5 min |
-| Exercise 4 (Go agent) + `temporal-ts-net` walk-through | 15 min |
+| Exercise 4 - Metrics watcher + `temporal-ts-net` walk-through | 15 min |
 | Wrap-up + Q&A | 5 min |
 
-Exercise 4 translates the weather agent to Go, reusing the same Temporal server, Aperture endpoint, and tailnet.
+Exercise 4 is a finished Go worker that joins the tailnet via `tsnet`, pulls `node_exporter` metrics from another tailnet node, and runs a Temporal Schedule that asks Claude via Aperture for a plain-English health report. Attendees run it, tune the interval, and tweak the prompt.
 
 ## What attendees need

exercises/04_go_agent/README.md

Lines changed: 102 additions & 30 deletions
@@ -1,51 +1,123 @@
-# Exercise 4: Go Agent (Stretch Goal)
+# Exercise 4: Metrics Watcher
 
-The same weather agent pattern, implemented in Go.
+A finished Go worker that joins the tailnet via `tsnet`, scrapes
+`node_exporter` metrics from a pre-provisioned `metrics-server` VM, and
+asks Claude (via Aperture) for a plain-English health summary on a
+schedule.
 
 ## Goal
 
-Port the Python agentic loop to Go, demonstrating that Temporal + Tailscale + Aperture work across languages. The same shared Temporal server, the same Aperture endpoint, different language.
+Run the Exercise 2 `tsnet` pattern against real services: pull
+`node_exporter` metrics off the tailnet, summarize them with Claude
+via Aperture, and schedule the whole thing with Temporal. The code is
+complete. Run it, tune the cadence, watch the runs in the Temporal UI.
 
-## Status
+## Background
 
-This exercise is a **take-home stretch goal**. The Go files are stubbed out with TODOs describing what each function should do. Use the Python implementation in Exercise 3 as your reference.
+### Topology
 
-## Architecture
+```mermaid
+flowchart LR
+VM[Your Instruqt VM<br/>Go worker]
+TS[Temporal Dev Server<br/>temporal-dev:7233 / :8233]
+MS[metrics-server<br/>node_exporter :9100]
+AP[Aperture<br/>API Gateway]
+VM <-. Tailnet .-> TS
+VM <-. Tailnet .-> MS
+VM <-. Tailnet .-> AP
+```
 
-The Go agent follows the same pattern as the Python version:
+### What's different from Exercise 2
 
-1. `CreateCompletion` activity calls the OpenAI API through Aperture
-2. The workflow loops: ask the LLM → execute chosen tool → feed result back
-3. Tool activities (`GetWeatherAlerts`, `GetIPAddress`, `GetLocationInfo`) call the same public APIs
-4. Worker connects to the shared Temporal server on the tailnet
+- **Temporal Schedule** with `TriggerImmediately`: fires once on start, then every `HEALTH_CHECK_INTERVAL` (default `10m`). Durable on the server.
+- Data comes from another tailnet node (`metrics-server:9100`), not the public internet.
+- LLM call goes to **Claude via Aperture**: same gateway as Exercise 3, different vendor.
+- Returns a structured `HealthReport` instead of a string, so the Temporal UI renders each field cleanly.
 
-The key difference: Go doesn't have an official OpenAI SDK integrated with Temporal, so you'll make HTTP requests directly to the Aperture endpoint (which is OpenAI-compatible).
+### What's already built for you
 
-## Files
+- `main.go`: joins the tailnet via `tsnet`, dials Temporal, creates the Schedule.
+- `activities.go`: `FetchMetrics` and `AnalyzeMetrics` (returns `HealthReport`).
+- `workflow.go`: `HealthCheckWorkflow` chains the two activities.
+- Tests: run offline with `go test ./...`.
 
-| File | What to implement |
-|------|-------------------|
-| `practice/main.go` | Worker setup and workflow starter |
-| `practice/workflow.go` | Agentic loop workflow |
-| `practice/activities.go` | OpenAI API call + tool implementations |
+## Run it
 
-## Getting Started
+### Step 1: Go to the practice directory
 
 ```bash
 cd exercises/04_go_agent/practice
-go mod tidy
-go run . # Start the worker
-go run . run "What's the weather like where I am?" # Run a workflow
+go mod download
+```
+
+### Step 2: Start the worker
+
+```bash
+WORKSHOP_USER_ID=$WORKSHOP_USER_ID \
+TS_AUTHKEY=tskey-auth-<your-key> \
+METRICS_URL=http://metrics-server:9100/metrics \
+go run .
+```
+
+First run takes 10-30 seconds while `tsnet` registers the node. After that:
+
+```
+level=INFO msg="joined tailnet" hostname=<you>-metrics-worker userID=<you>
+level=INFO msg="connected to temporal" host=temporal-dev:7233
+level=INFO msg="metrics reachable" url=http://metrics-server:9100/metrics
+level=INFO msg="created schedule" id=<you>-health-check-schedule interval=10m0s workflow=<you>-health-check
+```
+
+The schedule fires immediately. You'll see a completed workflow in the Temporal UI within seconds.
+
+### Step 3: Watch it in the Temporal UI
+
+Open `http://temporal-dev:8233`. Two places to look:
+
+- **Schedules**: click `<you>-health-check-schedule` to see the interval, next fire time, and recent fires.
+- **Workflows**: search for `<you>-health-check`. Each completed row (ID is suffixed with the schedule fire time) has the `HealthReport` in its Result panel.
+
+### Step 4: Tune the cadence
+
+10m is too slow to watch during the workshop. `Ctrl+C`, restart with a shorter interval:
+
+```bash
+HEALTH_CHECK_INTERVAL=2m \
+WORKSHOP_USER_ID=$WORKSHOP_USER_ID \
+TS_AUTHKEY=tskey-auth-<your-key> \
+METRICS_URL=http://metrics-server:9100/metrics \
+go run .
 ```
 
-## Hints
+Any Go duration (`30s`, `5m`, `1h`). The worker recreates the schedule on startup, so restarting just takes effect.
+
+### Step 5: Customize the Claude prompt
+
+Open `activities.go`, find `AnalyzeMetrics`. Change the prompt: request a different field, flag anything unusual, whatever. Restart the worker and watch the `HealthReport` change on the next fire.
+
+## Environment variables
+
+| Variable | Required | Default | Description |
+|-------------------------|----------|-----------------------|---------------------------------------------------------------------|
+| `TS_AUTHKEY` | yes* | (none) | Tailscale auth key. Required on first run; tsnet reuses state after.|
+| `METRICS_URL` | yes | (none) | `node_exporter` endpoint on the tailnet. |
+| `WORKSHOP_USER_ID` | no | `lab` | Prefixes hostname, task queue, and workflow ID. |
+| `HEALTH_CHECK_INTERVAL` | no | `10m` | Cadence as a Go duration (`30s`, `5m`, `1h`). |
+| `TEMPORAL_HOST` | no | `temporal-dev:7233` | Temporal server address. |
+| `AI_URL` | no | `http://ai` | Aperture endpoint. |
+| `AI_MODEL` | no | `claude-haiku-4-5` | Claude model. |
+
+## Run the tests
+
+```bash
+go test ./...
+```
 
-- The OpenAI Responses API endpoint is `POST /v1/responses`
-- Set the `Authorization` header to `Bearer $OPENAI_API_KEY`
-- Use `encoding/json` to marshal/unmarshal request/response bodies
-- The `base_url` should come from `OPENAI_BASE_URL` environment variable
-- Look at `exercises/03_weather_agent/solution/` for the complete Python reference
+Tests mock `node_exporter` and Aperture with `httptest.Server`, no tailnet needed.
 
-## Solution
+## What You've Learned
 
-The solution will be added in a future update. For now, use the Python implementation as your guide.
+- `tsnet.Dial` works for both tailnet-internal HTTP and gRPC
+- Aperture is model-agnostic: Anthropic here, OpenAI in Exercise 3
+- Temporal Schedules with `TriggerImmediately` fire now, then every N, with the next fire visible in the UI
+- All three backing services are tailnet-only; Tailscale identity is the auth layer
