Commit b8dbbac

sync temporal-developer skill manual from source repo
1 parent 2357cf8 commit b8dbbac

40 files changed

Lines changed: 225 additions & 50 deletions

skills/temporal-developer/SKILL.md

Lines changed: 12 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
22
name: temporal-developer
3-
description: This skill should be used when the user asks to "create a Temporal workflow", "write a Temporal activity", "debug stuck workflow", "fix non-determinism error", "Temporal Python", "Temporal TypeScript", "Temporal Go", "Temporal Golang", "Temporal Java", "workflow replay", "activity timeout", "signal workflow", "query workflow", "worker not starting", "activity keeps retrying", "Temporal heartbeat", "continue-as-new", "child workflow", "saga pattern", "workflow versioning", "durable execution", "reliable distributed systems", or mentions Temporal SDK development.
3+
description: Develop, debug, and manage Temporal applications across Python, TypeScript, Go, and Java. Use when the user is building workflows, activities, or workers with a Temporal SDK, debugging issues like non-determinism errors, stuck workflows, or activity retries, using Temporal CLI, Temporal Server, or Temporal Cloud, or working with durable execution concepts like signals, queries, heartbeats, versioning, continue-as-new, child workflows, or saga patterns.
44
version: 0.2.0
55
---
66

@@ -77,34 +77,34 @@ Once you've downloaded the file, extract the downloaded archive and add the temp
7777
### Read All Relevant References
7878

7979
1. First, read the getting started guide for the language you are working in:
80-
- Python -> read `references/python/python.md`
81-
- TypeScript -> read `references/typescript/typescript.md`
82-
- Java -> read `references/java/java.md`
83-
- Go -> read `references/go/go.md`
80+
- Python -> read `references/python/python.md`
81+
- TypeScript -> read `references/typescript/typescript.md`
82+
- Java -> read `references/java/java.md`
83+
- Go -> read `references/go/go.md`
8484
2. Second, read appropriate `core` and language-specific references for the task at hand.
8585

86-
8786
## Primary References
87+
8888
- **`references/core/determinism.md`** - Why determinism matters, replay mechanics, basic concepts of activities
89-
+ Language-specific info at `references/{your_language}/determinism.md`
89+
- Language-specific info at `references/{your_language}/determinism.md`
9090
- **`references/core/patterns.md`** - Conceptual patterns (signals, queries, saga)
91-
+ Language-specific info at `references/{your_language}/patterns.md`
91+
- Language-specific info at `references/{your_language}/patterns.md`
9292
- **`references/core/gotchas.md`** - Anti-patterns and common mistakes
93-
+ Language-specific info at `references/{your_language}/gotchas.md`
93+
- Language-specific info at `references/{your_language}/gotchas.md`
9494
- **`references/core/versioning.md`** - Versioning strategies and concepts - how to safely change workflow code while workflows are running
95-
+ Language-specific info at `references/{your_language}/versioning.md`
95+
- Language-specific info at `references/{your_language}/versioning.md`
9696
- **`references/core/troubleshooting.md`** - Decision trees, recovery procedures
9797
- **`references/core/error-reference.md`** - Common error types, workflow status reference
9898
- **`references/core/interactive-workflows.md`** - Testing signals, updates, queries
9999
- **`references/core/dev-management.md`** - Dev cycle & management of server and workers
100100
- **`references/core/ai-patterns.md`** - AI/LLM pattern concepts
101-
+ Language-specific info at `references/{your_language}/ai-patterns.md`, if available. Currently Python only.
101+
- Language-specific info at `references/{your_language}/ai-patterns.md`, if available. Currently Python only.
102102

103103
## Additional Topics
104+
104105
- **`references/{your_language}/observability.md`** - See for language-specific implementation guidance on observability in Temporal
105106
- **`references/{your_language}/advanced-features.md`** - See for language-specific guidance on advanced Temporal features and language-specific features
106107

107-
108108
## Feedback
109109

110110
### ALWAYS PROMPT USER TO PROVIDE FEEDBACK AT STARTUP

skills/temporal-developer/references/core/ai-patterns.md

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@ The remainder of this document describes general principles to follow when build
3232
- returns model response, as a typed structured output
3333

3434
**Benefits**:
35+
3536
- Single activity handles multiple use cases
3637
- Consistent retry handling
3738
- Centralized configuration
@@ -48,6 +49,7 @@ Workflow:
4849
```
4950

5051
**Benefits**:
52+
5153
- Independent retry for each step
5254
- Clear audit trail in history
5355
- Easier testing and mocking
@@ -69,17 +71,17 @@ Workflow:
6971
Disable retries in LLM client libraries and let Temporal handle retries.
7072

7173
- LLM Client Config:
72-
- max_retries = 0 ← Disable client retries at the LLM client level
74+
- max_retries = 0 ← Disable client retries at the LLM client level
7375

7476
Use either the default activity retry policy, or customize it as needed for the situation.
7577

7678
**Why**:
79+
7780
- Temporal retries are durable (survive crashes)
7881
- Single retry configuration point
7982
- Better visibility into retry attempts
8083
- Consistent backoff behavior
8184

82-
8385
### Pattern 5: Multi-Agent Orchestration
8486

8587
Complex pipelines with multiple specialized agents:
@@ -114,6 +116,7 @@ Deep Research Example:
114116
| Document processing | 60-120 seconds |
115117

116118
**Rationale**:
119+
117120
- Reasoning models need time for complex computation
118121
- Web searches may hit rate limits requiring backoff
119122
- Fast timeouts catch stuck operations
@@ -128,7 +131,6 @@ Parse rate limit info from API responses:
128131
- Response Headers:
129132
- Retry-After: 30
130133
- X-RateLimit-Remaining: 0
131-
132134
- Activity:
133135
- If rate limited:
134136
- Raise retryable error with a next retry delay
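
A minimal sketch of the header parsing described above, using only the Python standard library. The function name `next_retry_delay_seconds` and the `default_delay` fallback are hypothetical and not part of any Temporal SDK. The sketch assumes `Retry-After` carries either a number of seconds or an HTTP date, and it only computes the delay. Attaching that delay to a retryable error is left to the SDK for your language.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def next_retry_delay_seconds(
    status: int,
    headers: Mapping[str, str],
    now: Optional[datetime] = None,
    default_delay: float = 1.0,  # hypothetical fallback when no usable hint exists
) -> Optional[float]:
    """Return a suggested retry delay in seconds, or None if not rate limited."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {key.lower(): value for key, value in headers.items()}
    rate_limited = status == 429 or normalized.get("x-ratelimit-remaining") == "0"
    if not rate_limited:
        return None

    retry_after = normalized.get("retry-after")
    if retry_after is None:
        return default_delay

    # Form 1: a plain number of seconds, e.g. "30".
    try:
        return max(0.0, float(retry_after))
    except ValueError:
        pass

    # Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        retry_at = parsedate_to_datetime(retry_after)
    except (TypeError, ValueError):
        return default_delay
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    return max(0.0, (retry_at - current).total_seconds())
```

An activity could call this helper on each API response and, when it returns a value, raise a retryable error carrying that delay.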
@@ -137,12 +139,14 @@ Parse rate limit info from API responses:
137139
## Error Handling
138140

139141
### Retryable Errors
142+
140143
- Rate limits (429)
141144
- Timeouts
142145
- Temporary server errors (500, 502, 503)
143146
- Network errors
144147

145148
### Non-Retryable Errors
149+
146150
- Invalid API key (401)
147151
- Invalid input/prompt
148152
- Content policy violations
@@ -161,6 +165,6 @@ Parse rate limit info from API responses:
161165
## Observability
162166

163167
See `references/{your_language}/observability.md` for the language you are working in for documentation on implementing observability in Temporal. It is generally recommended to add observability for:
168+
164169
- Token usage, via activity logging
165170
- Anything else that helps track LLM usage and debug agentic flows, in moderation.
166-

skills/temporal-developer/references/core/determinism.md

Lines changed: 12 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -50,22 +50,27 @@ Result: Commands don't match history → NondeterminismError
5050
## Sources of Non-Determinism
5151

5252
### Time-Based Operations
53+
5354
- `datetime.now()`, `time.time()`, `Date.now()`
5455
- Different value on each execution
5556

5657
### Random Values
58+
5759
- `random.random()`, `Math.random()`, `uuid.uuid4()`
5860
- Different value on each execution
5961

6062
### External State
63+
6164
- Reading files, environment variables, databases, networking / HTTP calls
6265
- State may change between executions
6366

6467
### Non-Deterministic Iteration
68+
6569
- Map/dict iteration order (in some languages)
6670
- Set iteration order
6771

6872
### Threading/Concurrency
73+
6974
- Race conditions produce different outcomes
7075
- Non-deterministic ordering
7176

@@ -76,9 +81,10 @@ In Temporal, activities are the primary mechanism for making non-deterministic c
7681
For a few simple cases, such as timestamps, random values, and UUIDs, the Temporal SDK in your language may provide durable variants that are simple to use. See `references/{your_language}/determinism.md` for the language you are working in for more info.
7782

7883
## SDK Protection Mechanisms
84+
7985
Each Temporal SDK language provides a different level of protection against non-determinism:
8086

81-
- Python: The Python SDK runs workflows in a sandbox that intercepts and aborts non-deterministic calls early at runtime.
87+
- Python: The Python SDK runs workflows in a sandbox that intercepts and aborts non-deterministic calls early at runtime.
8288
- TypeScript: The TypeScript SDK runs workflows in an isolated V8 sandbox, intercepting many common sources of non-determinism and replacing them automatically with deterministic variants.
8389
- Java: The Java SDK has no sandbox. Determinism is enforced by developer conventions — the SDK provides `Workflow.*` APIs as safe alternatives (e.g., `Workflow.sleep()` instead of `Thread.sleep()`), and non-determinism is only detected at replay time via `NonDeterministicException`. A static analysis tool (`temporal-workflowcheck`, beta) can catch violations at build time. Cooperative threading under a global lock eliminates the need for synchronization.
8490
- Go: The Go SDK has no runtime sandbox. Therefore, non-determinism bugs will never be immediately apparent, and are usually only observable during replay. The optional `workflowcheck` static analysis tool can be used to check for many sources of non-determinism at compile time.
@@ -88,6 +94,7 @@ Regardless of which SDK you are using, it is your responsibility to ensure that
8894
## Detecting Non-Determinism
8995

9096
### During Execution
97+
9198
- `NondeterminismError` raised when Commands don't match Events
9299
- Workflow becomes blocked until code is fixed
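
The mismatch between Commands and Events can be illustrated with a toy model in plain Python. This is not the Temporal SDK. The `replay` function, the string-based command names, and the `external_state` dictionary are all hypothetical stand-ins, and the `NondeterminismError` class here is a local placeholder rather than the SDK's own type. The sketch only shows why workflow code that reads changing external state produces different Commands on replay.

```python
from typing import Callable, Dict, List


class NondeterminismError(Exception):
    """Toy stand-in for the error raised when replay diverges from history."""


def replay(workflow: Callable[[], List[str]], history: List[str]) -> None:
    """Re-run the workflow and compare its commands to the recorded history."""
    commands = workflow()
    if commands != history:
        raise NondeterminismError(
            f"commands {commands} do not match history {history}"
        )


# Hypothetical external state that may change between executions.
external_state: Dict[str, bool] = {"feature_enabled": True}


def deterministic_workflow() -> List[str]:
    # Always emits the same commands, so replay always matches.
    return ["ScheduleActivity:charge", "ScheduleActivity:notify"]


def state_dependent_workflow() -> List[str]:
    # Branches on external state, so replay can take a different path.
    if external_state["feature_enabled"]:
        return ["ScheduleActivity:charge", "ScheduleActivity:notify"]
    return ["ScheduleActivity:charge"]
```

Recording a history with `state_dependent_workflow()`, flipping `external_state["feature_enabled"]`, and then calling `replay` raises the toy error, while `deterministic_workflow` replays cleanly.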
93100

@@ -98,13 +105,17 @@ Replay tests verify that workflows follow identical code paths when re-run, by a
98105
## Recovery from Non-Determinism
99106

100107
### Accidental Change
108+
101109
If you accidentally introduced non-determinism:
110+
102111
1. Revert code to match what's in history
103112
2. Restart worker
104113
3. Workflow auto-recovers
105114

106115
### Intentional Change
116+
107117
If you need to change workflow logic:
118+
108119
1. Use the **Patching API** to support both old and new code paths
109120
2. Or terminate old workflows and start new ones with updated code
110121

skills/temporal-developer/references/core/dev-management.md

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,6 @@ When you need a new worker, you should start it in the background (and preferrab
2020

2121
**Best practice**: As far as local development goes, run only ONE worker instance with the latest code. Don't keep stale workers (running old code) around.
2222

23-
2423
### Cleanup
2524

2625
**Always kill workers when done.** Don't leave workers running.

skills/temporal-developer/references/core/error-reference.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -6,14 +6,14 @@
66
| **Deadlock** | TMPRL1101 | `WorkflowTaskFailed` in history, worker logs | Workflow blocked too long (deadlock detected) | Remove blocking operations from workflow code (no I/O, no sleep, no threading locks). Use Temporal primitives instead. | https://github.com/temporalio/rules/blob/main/rules/TMPRL1101.md |
77
| **Unfinished handlers** | TMPRL1102 | `WorkflowTaskFailed` in history | Workflow completed while update/signal handlers still running | Ensure all handlers complete before workflow finishes. Use `workflow.wait_condition()` to wait for handler completion. | https://github.com/temporalio/rules/blob/main/rules/TMPRL1102.md |
88
| **Payload overflow** | TMPRL1103 | `WorkflowTaskFailed` or `ActivityTaskFailed` in history | Payload size limit exceeded (default 2MB) | Reduce payload size. Use external storage (S3, database) for large data and pass references instead. | https://github.com/temporalio/rules/blob/main/rules/TMPRL1103.md |
9-
| **Workflow code bug** | | `WorkflowTaskFailed` in history | Bug in workflow logic | Fix code → Restart worker → Workflow auto-resumes | |
10-
| **Missing workflow** | | Worker logs | Workflow not registered | Add to worker.py → Restart worker | |
11-
| **Missing activity** | | Worker logs | Activity not registered | Add to worker.py → Restart worker | |
12-
| **Activity bug** | | `ActivityTaskFailed` in history | Bug in activity code | Fix code → Restart worker → Auto-retries | |
13-
| **Activity retries** | | `ActivityTaskFailed` (count >2) | Repeated failures | Fix code → Restart worker → Auto-retries | |
14-
| **Sandbox violation** | | Worker logs | Bad imports in workflow | Fix workflow.py imports → Restart worker | |
15-
| **Task queue mismatch** | | Workflow never starts | Different queues in starter/worker | Align task queue names | |
16-
| **Timeout** | | Status = TIMED_OUT | Operation too slow | Increase timeout config | |
9+
| **Workflow code bug** | | `WorkflowTaskFailed` in history | Bug in workflow logic | Fix code → Restart worker → Workflow auto-resumes | |
10+
| **Missing workflow** | | Worker logs | Workflow not registered | Add to worker.py → Restart worker | |
11+
| **Missing activity** | | Worker logs | Activity not registered | Add to worker.py → Restart worker | |
12+
| **Activity bug** | | `ActivityTaskFailed` in history | Bug in activity code | Fix code → Restart worker → Auto-retries | |
13+
| **Activity retries** | | `ActivityTaskFailed` (count >2) | Repeated failures | Fix code → Restart worker → Auto-retries | |
14+
| **Sandbox violation** | | Worker logs | Bad imports in workflow | Fix workflow.py imports → Restart worker | |
15+
| **Task queue mismatch** | | Workflow never starts | Different queues in starter/worker | Align task queue names | |
16+
| **Timeout** | | Status = TIMED_OUT | Operation too slow | Increase timeout config | |
1717

1818
## Workflow Status Reference
1919

skills/temporal-developer/references/core/gotchas.md

Lines changed: 23 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@ This document provides a general overview of conceptual-level gotchas in Tempora
99
**The Problem**: Activities may execute more than once due to retries or Worker failures. If an activity calls an external service without an idempotency key, you may charge a customer twice, send duplicate emails, or create duplicate records.
1010

1111
**Symptoms**:
12+
1213
- Duplicate side effects (double charges, duplicate notifications)
1314
- Data inconsistencies after retries
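
A minimal sketch of how an idempotency key avoids these duplicate side effects. `FakePaymentService` and `charge_activity` are hypothetical names, the service is an in-memory stand-in for a real external API, and deriving the key from a workflow ID plus an activity ID is an assumption made for illustration.

```python
from typing import Dict


class FakePaymentService:
    """In-memory stand-in for an external service that honors idempotency keys."""

    def __init__(self) -> None:
        self._charges: Dict[str, int] = {}
        self.charge_count = 0  # number of charges actually performed

    def charge(self, idempotency_key: str, amount_cents: int) -> int:
        # A repeated key returns the stored result instead of charging again.
        if idempotency_key in self._charges:
            return self._charges[idempotency_key]
        self.charge_count += 1
        self._charges[idempotency_key] = amount_cents
        return amount_cents


def charge_activity(
    service: FakePaymentService,
    workflow_id: str,
    activity_id: str,
    amount_cents: int,
) -> int:
    # The key must be stable across retries of the same logical operation.
    key = f"{workflow_id}:{activity_id}"
    return service.charge(key, amount_cents)
```

Calling `charge_activity` twice with the same workflow and activity IDs, as a retry would, performs only one charge.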
1415

@@ -21,18 +22,20 @@ This document provides a general overview of conceptual-level gotchas in Tempora
2122
**The Problem**: Code in workflow functions runs on first execution AND on every replay. Any side effect (logging, notifications, metrics, etc.) will happen multiple times and non-deterministic code (IO, current time, random numbers, threading, etc.) won't replay correctly.
2223

2324
**Symptoms**:
25+
2426
- Non-determinism errors
2527
- Sandbox violations, depending on SDK language
2628
- Duplicate log entries
2729
- Multiple notifications for the same event
2830
- Inflated metrics
2931

3032
**The Fix**:
33+
3134
- Use Temporal replay-aware managed side effects for common, non-business logic cases:
32-
- Temporal workflow logging
33-
- Temporal date time
34-
- Temporal UUID generation
35-
- Temporal random number generation
35+
- Temporal workflow logging
36+
- Temporal date time
37+
- Temporal UUID generation
38+
- Temporal random number generation
3639
- Put all other side effects in Activities
3740

3841
See `references/core/determinism.md` for more info.
@@ -42,10 +45,12 @@ See `references/core/determinism.md` for more info.
4245
**The Problem**: If Worker A runs part of a workflow with code v1, then Worker B (with code v2) picks it up, replay may produce different Commands.
4346

4447
**Symptoms**:
48+
4549
- Non-determinism errors after deploying new code
4650
- Errors mentioning "command mismatch" or "unexpected command"
4751

4852
**The Fix**:
53+
4954
- Use Worker Versioning for production deployments
5055
- Use patching APIs
5156
- During development: kill old workers before starting new ones
@@ -60,6 +65,7 @@ See `references/core/versioning.md` for more info.
6065
**The Problem**: Using aggressive activity retry policies that give up too easily.
6166

6267
**Symptoms**:
68+
6369
- Workflows failing on transient errors
6470
- Unnecessary workflow failures during brief outages
6571

@@ -72,6 +78,7 @@ See `references/core/versioning.md` for more info.
7278
**The Problem**: Queries and update validators are read-only. Modifying state causes non-determinism on replay, and must strictly be avoided.
7379

7480
**Symptoms**:
81+
7582
- State inconsistencies after workflow replay
7683
- Non-determinism errors
7784

@@ -82,6 +89,7 @@ See `references/core/versioning.md` for more info.
8289
**The Problem**: Queries and update validators must return immediately. They cannot await activities, child workflows, timers, or conditions.
8390

8491
**Symptoms**:
92+
8593
- Query / update validators timeouts
8694
- Deadlocks
8795

@@ -110,6 +118,7 @@ See language-specific gotchas for details.
110118
**The Problem**: Not testing what happens when things go wrong.
111119

112120
**Questions to answer**:
121+
113122
- What happens when an Activity exhausts all retries?
114123
- What happens when a workflow is cancelled mid-execution?
115124
- What happens during a Worker restart?
@@ -121,6 +130,7 @@ See language-specific gotchas for details.
121130
**The Problem**: Changing workflow code without verifying existing workflows can still replay.
122131

123132
**Symptoms**:
133+
124134
- Non-determinism errors after deployment
125135
- Stuck workflows that can't make progress
126136

@@ -133,6 +143,7 @@ See language-specific gotchas for details.
133143
**The Problem**: Catching errors without proper handling hides failures.
134144

135145
**Symptoms**:
146+
136147
- Silent failures
137148
- Workflows completing "successfully" despite errors
138149
- Difficult debugging
@@ -144,10 +155,12 @@ See language-specific gotchas for details.
144155
**The Problem**: Marking transient errors as non-retryable, or permanent errors as retryable.
145156

146157
**Symptoms**:
158+
147159
- Workflows failing on temporary network issues (if marked non-retryable)
148160
- Infinite retries on invalid input (if marked retryable)
149161

150162
**The Fix**:
163+
151164
- **Retryable**: Network errors, timeouts, rate limits, temporary unavailability
152165
- **Non-retryable**: Invalid input, authentication failures, business rule violations, resource not found
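
The split above could be sketched as a small classifier over exception types. The mapping to standard-library exceptions is an approximation chosen for illustration, `RateLimitedError` and `BusinessRuleViolation` are hypothetical classes, and treating unknown errors as retryable is an assumption rather than a rule from this document.

```python
class RateLimitedError(Exception):
    """Hypothetical error for a rate-limited call."""


class BusinessRuleViolation(Exception):
    """Hypothetical error for a request that violates a business rule."""


# Transient failures: retrying later may succeed.
RETRYABLE = (ConnectionError, TimeoutError, RateLimitedError)

# Permanent failures: retrying with the same input cannot succeed.
NON_RETRYABLE = (
    ValueError,  # invalid input
    PermissionError,  # stands in for authentication failures
    FileNotFoundError,  # stands in for resource not found
    BusinessRuleViolation,
)


def is_retryable(exc: BaseException) -> bool:
    """Return True if the error looks transient and worth retrying."""
    if isinstance(exc, NON_RETRYABLE):
        return False
    if isinstance(exc, RETRYABLE):
        return True
    # Assumption: unknown errors are treated as retryable.
    return True
```

An activity could use such a check to decide whether to mark an error as non-retryable before raising it.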
153166

@@ -158,6 +171,7 @@ See language-specific gotchas for details.
158171
**The Problem**: When a workflow is cancelled, cleanup code after the cancellation point doesn't run unless explicitly protected.
159172

160173
**Symptoms**:
174+
161175
- Resources not released after cancellation
162176
- Incomplete compensation/rollback
163177
- Leaked state
@@ -169,10 +183,12 @@ See language-specific gotchas for details.
169183
**The Problem**: Activities must opt in to receive cancellation. Without proper handling, a cancelled activity continues running to completion, wasting resources.
170184

171185
**Requirements for activity cancellation**:
186+
172187
1. **Heartbeating** - Cancellation is delivered via heartbeat. Activities that don't heartbeat won't know they've been cancelled.
173188
2. **Checking for cancellation** - Activity must explicitly check for cancellation or await a cancellation signal.
174189

175190
**Symptoms**:
191+
176192
- Cancelled activities running to completion
177193
- Wasted compute on work that will be discarded
178194
- Delayed workflow cancellation
@@ -184,11 +200,13 @@ See language-specific gotchas for details.
184200
**The Problem**: Temporal has built-in limits on payload sizes. Exceeding them causes workflows to fail.
185201

186202
**Limits**:
203+
187204
- Max 2MB per individual payload
188205
- Max 4MB per gRPC message
189-
- Max 50MB for complete workflow history (aim for <10MB in practice)
206+
- Max 50MB for complete workflow history (aim for < 10MB in practice)
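
A rough pre-flight check against the per-payload limit might look like the sketch below. It assumes the value is JSON-serializable, that JSON-encoded size approximates the size your data converter produces, and that 2MB means 2 * 1024 * 1024 bytes. The helper names are hypothetical.

```python
import json
from typing import Any

# Assumption: the 2MB limit is interpreted as 2 * 1024 * 1024 bytes.
MAX_PAYLOAD_BYTES = 2 * 1024 * 1024


def payload_size_bytes(value: Any) -> int:
    """Approximate the serialized size of a value using JSON encoding."""
    return len(json.dumps(value).encode("utf-8"))


def fits_in_payload(value: Any, limit: int = MAX_PAYLOAD_BYTES) -> bool:
    """Return True if the JSON-encoded value is within the size limit."""
    return payload_size_bytes(value) <= limit
```

When `fits_in_payload` returns `False`, the data can be stored externally and a reference passed instead.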
190207

191208
**Symptoms**:
209+
192210
- Payload too large errors
193211
- gRPC message size exceeded errors
194212
- Workflow history growing unboundedly
