# Exercise 4: Metrics Watcher

A finished Go worker that joins the tailnet via `tsnet`, scrapes
`node_exporter` metrics from a pre-provisioned `metrics-server` VM, and
asks Claude (via Aperture) for a plain-English health summary on a
schedule.

## Goal

Run the Exercise 2 `tsnet` pattern against real services: pull
`node_exporter` metrics off the tailnet, summarize them with Claude
via Aperture, and schedule the whole thing with Temporal. The code is
complete. Run it, tune the cadence, and watch the runs in the Temporal UI.

## Background

### Topology

```mermaid
flowchart LR
    VM[Your Instruqt VM<br/>Go worker]
    TS[Temporal Dev Server<br/>temporal-dev:7233 / :8233]
    MS[metrics-server<br/>node_exporter :9100]
    AP[Aperture<br/>API Gateway]
    VM <-. Tailnet .-> TS
    VM <-. Tailnet .-> MS
    VM <-. Tailnet .-> AP
```

### What's different from Exercise 2

- **Temporal Schedule** with `TriggerImmediately`: fires once on start, then every `HEALTH_CHECK_INTERVAL` (default `10m`). The schedule lives durably on the server.
- Data comes from another tailnet node (`metrics-server:9100`), not the public internet.
- The LLM call goes to **Claude via Aperture**: the same gateway as Exercise 3, a different vendor.
- The workflow returns a structured `HealthReport` instead of a string, so the Temporal UI renders each field cleanly.

### What's already built for you

- `main.go`: joins the tailnet via `tsnet`, dials Temporal, and creates the Schedule.
- `activities.go`: `FetchMetrics` and `AnalyzeMetrics` (which returns a `HealthReport`).
- `workflow.go`: `HealthCheckWorkflow` chains the two activities.
- Tests: run offline with `go test ./...`.

## Run it

### Step 1: Go to the practice directory

```bash
cd exercises/04_go_agent/practice
go mod download
```

### Step 2: Start the worker

```bash
WORKSHOP_USER_ID=$WORKSHOP_USER_ID \
TS_AUTHKEY=tskey-auth-<your-key> \
METRICS_URL=http://metrics-server:9100/metrics \
go run .
```

First run takes 10-30 seconds while `tsnet` registers the node. After that:

```
level=INFO msg="joined tailnet" hostname=<you>-metrics-worker userID=<you>
level=INFO msg="connected to temporal" host=temporal-dev:7233
level=INFO msg="metrics reachable" url=http://metrics-server:9100/metrics
level=INFO msg="created schedule" id=<you>-health-check-schedule interval=10m0s workflow=<you>-health-check
```

The schedule fires immediately. You'll see a completed workflow in the Temporal UI within seconds.

### Step 3: Watch it in the Temporal UI

Open `http://temporal-dev:8233`. There are two places to look:

- **Schedules**: click `<you>-health-check-schedule` to see the interval, the next fire time, and recent fires.
- **Workflows**: search for `<you>-health-check`. Each completed row (the ID is suffixed with the schedule fire time) has the `HealthReport` in its Result panel.

### Step 4: Tune the cadence

10m is too slow to watch during the workshop. Press `Ctrl+C` and restart with a shorter interval:

```bash
HEALTH_CHECK_INTERVAL=2m \
WORKSHOP_USER_ID=$WORKSHOP_USER_ID \
TS_AUTHKEY=tskey-auth-<your-key> \
METRICS_URL=http://metrics-server:9100/metrics \
go run .
```

Any Go duration string works (`30s`, `5m`, `1h`). The worker recreates the schedule on startup, so the new interval takes effect as soon as you restart.

### Step 5: Customize the Claude prompt

Open `activities.go` and find `AnalyzeMetrics`. Change the prompt: request a different field, flag anything unusual, whatever you like. Restart the worker and watch the `HealthReport` change on the next fire.

## Environment variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `TS_AUTHKEY` | yes | (none) | Tailscale auth key. Required on the first run; `tsnet` reuses its stored state afterward. |
| `METRICS_URL` | yes | (none) | `node_exporter` endpoint on the tailnet. |
| `WORKSHOP_USER_ID` | no | `lab` | Prefixes the hostname, task queue, and workflow ID. |
| `HEALTH_CHECK_INTERVAL` | no | `10m` | Cadence as a Go duration (`30s`, `5m`, `1h`). |
| `TEMPORAL_HOST` | no | `temporal-dev:7233` | Temporal server address. |
| `AI_URL` | no | `http://ai` | Aperture endpoint. |
| `AI_MODEL` | no | `claude-haiku-4-5` | Claude model. |

## Run the tests

```bash
go test ./...
```

Tests mock `node_exporter` and Aperture with `httptest.Server`, so no tailnet is needed.

## What you've learned

- `tsnet.Dial` works for both tailnet-internal HTTP and gRPC.
- Aperture is model-agnostic: Anthropic here, OpenAI in Exercise 3.
- Temporal Schedules created with `TriggerImmediately` fire once right away, then on every interval, with the next fire time visible in the UI.
- All three backing services are tailnet-only; Tailscale identity is the auth layer.