# Architecture

Three layers, one workshop.

## Topology

```mermaid
flowchart LR
    subgraph VM["Attendee VM (Instruqt or local)"]
        PY[Python worker]
        GO[Go worker]
    end
    subgraph VPS["Shared VPS"]
        TS["Temporal Dev Server<br/>temporal-dev:7233<br/>temporal-dev:8233"]
        AP["Aperture<br/>API Gateway"]
    end
    OAI["OpenAI API<br/>(shared key)"]
    VM <-. Tailnet .-> TS
    VM <-. Tailnet .-> AP
    AP --> OAI
```

Everything between the attendee machine and the shared infrastructure rides on an encrypted Tailscale mesh. There is no public port open on the VPS. If you're not on the tailnet, you can't reach it.

## The four pieces

### Temporal (durability)

A workflow orchestrator. The workshop uses a single `AgentWorkflow` that calls an Activity, feeds the result back to the LLM, and loops until the LLM is satisfied. Every tool invocation is its own Activity with its own retry policy. If the worker dies mid-reasoning, the workflow picks up exactly where it left off.

### Tailscale (networking)

A mesh VPN built on WireGuard. Every workshop VM joins one tailnet and can reach `temporal-dev:7233` and `http://ai` as if they were on a local network, with no port forwarding, no firewall rules, and no VPN concentrator.

### `temporal-ts-net` (the glue)

A Temporal CLI extension that runs the dev server and joins the tailnet via [tsnet](https://pkg.go.dev/tailscale.com/tsnet). Six lines of Go wrap `temporal server start-dev` and put it on the network under a memorable hostname. Installed on the shared VPS; the workshop consumes the hostname it exposes. Source: [temporal-community/temporal-ts-net](https://github.com/temporal-community/temporal-ts-net).

### Aperture (API gateway)

Sits between attendee code and OpenAI. Holds the real API key, forwards requests to `api.openai.com`, and enforces per-identity rate limits using the caller's Tailscale identity. Attendees never see the key, and one person running 500 agents can't burn the whole budget.
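
From the attendee's side, a call through Aperture is just an HTTP POST with no credential attached. A minimal standard-library sketch (`http://ai` is the tailnet hostname from above; the model name and payload shape are assumptions about the OpenAI Responses API, and `build_agent_request` is a hypothetical helper):

```python
import json
import urllib.request

def build_agent_request(prompt: str) -> urllib.request.Request:
    """Build a request to the gateway; Aperture injects the real key upstream."""
    body = json.dumps({"model": "gpt-4o-mini", "input": prompt}).encode()
    return urllib.request.Request(
        "http://ai/v1/responses",  # tailnet hostname, not api.openai.com
        data=body,
        # No Authorization header: the gateway rate-limits on the
        # caller's Tailscale identity and adds the credential itself.
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_agent_request("Weather alerts in California?")
```

Note what is absent: no token, no per-attendee secret. The Tailscale identity of the machine making the call is the only credential involved.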

## The two agent patterns

### Tool-calling (Exercise 3, Part 1)

```mermaid
flowchart TD
    U["User: 'Weather alerts in California?'"]
    LLM["LLM via Aperture"]
    A["Temporal Activity<br/>get_weather_alerts('CA')<br/>calls NWS API"]
    R["'There are 3 active alerts<br/>in California...'"]
    U --> LLM
    LLM -->|decides to call tool| A
    A -->|weather data| LLM
    LLM --> R
```

One decision, one tool, one formatted response. Useful on its own (for example, "answer questions about our API docs") and a stepping stone to the loop.
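
Stripped of the Temporal and OpenAI plumbing, the pattern reduces to a single branch. A toy sketch, where the `tool`/`args` dict mimics the shape of an LLM tool-call decision (not the exact OpenAI schema) and the alert list is fake data, not a real NWS response:

```python
def get_weather_alerts(state: str) -> list[str]:
    # In the workshop this is a Temporal Activity calling the NWS API.
    return ["Wind Advisory", "Flood Watch", "Heat Advisory"]

def handle(llm_decision: dict) -> str:
    # Exactly one decision: either the LLM asked for the tool,
    # or it answered directly.
    if llm_decision.get("tool") == "get_weather_alerts":
        alerts = get_weather_alerts(llm_decision["args"]["state"])
        return f"There are {len(alerts)} active alerts in California..."
    return llm_decision.get("answer", "")

reply = handle({"tool": "get_weather_alerts", "args": {"state": "CA"}})
```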

### Agentic loop (Exercise 3, Part 2)

```mermaid
flowchart TD
    U["User: 'What's the weather where I am?'"]
    subgraph Loop["Agentic Loop, repeats until LLM is done"]
        direction LR
        P[LLM picks a tool] --> E[Execute Activity]
        E --> F[Feed result back to LLM]
        F --> P
    end
    U --> Loop
    Loop --> R[Final response]
```

The LLM reasons through multiple steps on its own. The workshop's weather agent chains `get_ip_address`, `get_location_info`, and `get_weather_alerts` with no hand-coded flow control; each tool is dispatched dynamically based on what the LLM asks for.
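
The loop itself is small. Below is a Temporal-free sketch: a stub stands in for the LLM and plain functions stand in for Activities. The tool names come from the workshop; everything else (the stub's fixed plan, the fake results) is invented for illustration:

```python
# Fake tools keyed by name; in the real exercise each call here is a
# Temporal Activity with its own retry policy.
TOOLS = {
    "get_ip_address": lambda history: "203.0.113.7",      # TEST-NET address
    "get_location_info": lambda history: "Sacramento, CA",
    "get_weather_alerts": lambda history: "3 active alerts",
}

def fake_llm(history: list) -> dict:
    # Stands in for the model: picks the next tool from what it has
    # seen so far, then answers once it has everything it needs.
    plan = ["get_ip_address", "get_location_info", "get_weather_alerts"]
    if len(history) < len(plan):
        return {"tool": plan[len(history)]}
    return {"answer": f"Weather near you: {history[-1][1]}"}

def agent_loop(prompt: str) -> str:
    history = []
    while True:
        step = fake_llm(history)
        if "answer" in step:           # LLM is satisfied: exit the loop
            return step["answer"]
        name = step["tool"]            # dynamic dispatch by tool name
        result = TOOLS[name](history)
        history.append((name, result)) # feed the result back
```

There is no hand-coded flow control: the chain of three tools emerges entirely from the (here, stubbed) model's choices.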

## Aperture in the middle

```mermaid
sequenceDiagram
    autonumber
    participant VM as Attendee VM
    participant AP as Aperture
    participant OAI as OpenAI
    VM->>AP: POST /v1/responses<br/>(no API key needed)
    Note over AP: Identity: your-vm<br/>Rate: 3 / 10 requests
    AP->>OAI: POST /v1/responses<br/>Authorization: Bearer sk-real-openai-key
    OAI-->>AP: response
    AP-->>VM: response
```

Two things to notice:

1. The attendee never has the real key. They POST to `http://ai/v1/responses` with no Authorization header, and Aperture swaps in the real credential on the way out.
2. Aperture uses the **caller's Tailscale identity** as the rate-limit key. No extra auth tokens, no per-attendee secrets to distribute. Whoever the tailnet says you are, that's who Aperture bills.

## Why this stack

Each piece removes a category of operational pain:

- **Temporal** turns "my agent crashed halfway through reasoning" from a lost conversation into a resumed one.
- **Tailscale** deletes VPN and firewall setup from the attendee onboarding path.
- **`temporal-ts-net`** means the shared Temporal server is a hostname, not an IP:port behind a load balancer.
- **Aperture** means one OpenAI key can serve 50 people without any of them seeing it, and without one person draining the budget.

Remove any one of those and the workshop gets substantially harder to run.