skills/temporal-developer/SKILL.md
Lines changed: 12 additions & 12 deletions
Original file line number
Diff line number
Diff line change
@@ -1,6 +1,6 @@
1
1
---
2
2
name: temporal-developer
3
-
description: This skill should be used when the user asks to "create a Temporal workflow", "write a Temporal activity", "debug stuck workflow", "fix non-determinism error", "Temporal Python", "Temporal TypeScript", "Temporal Go", "Temporal Golang", "Temporal Java", "workflow replay", "activity timeout", "signal workflow", "query workflow", "worker not starting", "activity keeps retrying", "Temporal heartbeat", "continue-as-new", "child workflow", "saga pattern", "workflow versioning", "durable execution", "reliable distributed systems", or mentions Temporal SDK development.
3
+
description: Develop, debug, and manage Temporal applications across Python, TypeScript, Go, and Java. Use when the user is building workflows, activities, or workers with a Temporal SDK, debugging issues like non-determinism errors, stuck workflows, or activity retries, using Temporal CLI, Temporal Server, or Temporal Cloud, or working with durable execution concepts like signals, queries, heartbeats, versioning, continue-as-new, child workflows, or saga patterns.
4
4
version: 0.2.0
5
5
---
6
6
@@ -77,34 +77,34 @@ Once you've downloaded the file, extract the downloaded archive and add the temp
77
77
### Read All Relevant References
78
78
79
79
1. First, read the getting started guide for the language you are working in:
+ Language-specific info at `references/{your_language}/ai-patterns.md`, if available. Currently Python only.
101
+
- Language-specific info at `references/{your_language}/ai-patterns.md`, if available. Currently Python only.
102
102
103
103
## Additional Topics
104
+
104
105
- **`references/{your_language}/observability.md`** - See for language-specific implementation guidance on observability in Temporal
105
106
- **`references/{your_language}/advanced-features.md`** - See for language-specific guidance on advanced Temporal features and language-specific features
106
107
107
-
108
108
## Feedback
109
109
110
110
### ALWAYS PROMPT USER TO PROVIDE FEEDBACK AT STARTUP
skills/temporal-developer/references/core/ai-patterns.md
Lines changed: 8 additions & 4 deletions
@@ -32,6 +32,7 @@ The remainder of this document describes general principles to follow when build
32
32
- returns model response, as a typed structured output
33
33
34
34
**Benefits**:
35
+
35
36
- Single activity handles multiple use cases
36
37
- Consistent retry handling
37
38
- Centralized configuration
@@ -48,6 +49,7 @@ Workflow:
48
49
```
49
50
50
51
**Benefits**:
52
+
51
53
- Independent retry for each step
52
54
- Clear audit trail in history
53
55
- Easier testing and mocking
@@ -69,17 +71,17 @@ Workflow:
69
71
Disable retries in LLM client libraries and let Temporal handle retries.
70
72
71
73
- LLM Client Config:
72
-
- max_retries = 0 ← Disable client retries at the LLM client level
74
+
- max_retries = 0 ← Disable client retries at the LLM client level
73
75
74
76
Use either the default activity retry policy, or customize it as needed for the situation.
75
77
76
78
**Why**:
79
+
77
80
- Temporal retries are durable (survive crashes)
78
81
- Single retry configuration point
79
82
- Better visibility into retry attempts
80
83
- Consistent backoff behavior
81
84
82
-
83
85
### Pattern 5: Multi-Agent Orchestration
84
86
85
87
Complex pipelines with multiple specialized agents:
@@ -114,6 +116,7 @@ Deep Research Example:
114
116
| Document processing | 60-120 seconds |
115
117
116
118
**Rationale**:
119
+
117
120
- Reasoning models need time for complex computation
118
121
- Web searches may hit rate limits requiring backoff
119
122
- Fast timeouts catch stuck operations
@@ -128,7 +131,6 @@ Parse rate limit info from API responses:
128
131
- Response Headers:
129
132
- Retry-After: 30
130
133
- X-RateLimit-Remaining: 0
131
-
132
134
- Activity:
133
135
- If rate limited:
134
136
- Raise retryable error with a next retry delay
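
The rate-limit handling outlined above could be sketched as follows. This is a minimal, standard-library-only illustration: `RateLimitedError`, `retry_delay_from_headers`, and `check_rate_limit` are hypothetical names, and inside a real activity the SDK's own retryable error type would take the place of the hypothetical exception. The sketch assumes header names are matched exactly as written and only handles the seconds form of `Retry-After`.

```python
from datetime import timedelta
from typing import Mapping, Optional


class RateLimitedError(Exception):
    """Hypothetical retryable error that carries a suggested retry delay."""

    def __init__(self, message: str, next_retry_delay: Optional[timedelta]) -> None:
        super().__init__(message)
        self.next_retry_delay = next_retry_delay


def retry_delay_from_headers(headers: Mapping[str, str]) -> Optional[timedelta]:
    """Return the delay requested by Retry-After (seconds form only), if any."""
    value = headers.get("Retry-After")
    if value is None:
        return None
    try:
        return timedelta(seconds=int(value))
    except ValueError:
        # Retry-After may also be an HTTP date; this sketch does not parse it.
        return None


def check_rate_limit(status: int, headers: Mapping[str, str]) -> None:
    """Raise RateLimitedError when the response status signals rate limiting."""
    if status == 429:
        raise RateLimitedError("rate limited", retry_delay_from_headers(headers))
```

An activity would call `check_rate_limit` on each response and let the raised error propagate, so the retry is scheduled by Temporal rather than by the LLM client.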
@@ -137,12 +139,14 @@ Parse rate limit info from API responses:
137
139
## Error Handling
138
140
139
141
### Retryable Errors
142
+
140
143
- Rate limits (429)
141
144
- Timeouts
142
145
- Temporary server errors (500, 502, 503)
143
146
- Network errors
144
147
145
148
### Non-Retryable Errors
149
+
146
150
- Invalid API key (401)
147
151
- Invalid input/prompt
148
152
- Content policy violations
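
One way to encode the two lists above is a small classifier keyed on HTTP status. This is a sketch only: `is_retryable_status` is a hypothetical helper, the status sets contain just the codes named in the lists, and treating every unlisted status as non-retryable is an assumption that may not suit every API.

```python
# Status codes taken from the lists above.
RETRYABLE_STATUS = frozenset({429, 500, 502, 503})
NON_RETRYABLE_STATUS = frozenset({401})


def is_retryable_status(status: int) -> bool:
    """Return True when a failed call with this status is worth retrying."""
    if status in NON_RETRYABLE_STATUS:
        return False
    # Assumption: anything not explicitly listed is treated as non-retryable.
    return status in RETRYABLE_STATUS
```

Timeouts and network errors usually surface as exceptions rather than status codes, so they would need a separate check in the activity.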
@@ -161,6 +165,6 @@ Parse rate limit info from API responses:
161
165
## Observability
162
166
163
167
See `references/{your_language}/observability.md` for the language you are working in for documentation on implementing observability in Temporal. It is generally recommended to add observability for:
168
+
164
169
- Token usage, via activity logging
165
170
- Anything else that helps track LLM usage and debug agentic flows, in moderation.
@@ -76,9 +81,10 @@ In Temporal, activities are the primary mechanism for making non-deterministic c
76
81
For a few simple cases, such as timestamps, random values, and UUIDs, the Temporal SDK in your language may provide durable variants that are simple to use. See `references/{your_language}/determinism.md` for the language you are working in for more info.
77
82
78
83
## SDK Protection Mechanisms
84
+
79
85
Each Temporal SDK language provides a different level of protection against non-determinism:
80
86
81
-
- Python: The Python SDK runs workflows in a sandbox that intercepts and aborts non-deterministic calls early at runtime.
87
+
- Python: The Python SDK runs workflows in a sandbox that intercepts and aborts non-deterministic calls early at runtime.
82
88
- TypeScript: The TypeScript SDK runs workflows in an isolated V8 sandbox, intercepting many common sources of non-determinism and replacing them automatically with deterministic variants.
83
89
- Java: The Java SDK has no sandbox. Determinism is enforced by developer conventions — the SDK provides `Workflow.*` APIs as safe alternatives (e.g., `Workflow.sleep()` instead of `Thread.sleep()`), and non-determinism is only detected at replay time via `NonDeterministicException`. A static analysis tool (`temporal-workflowcheck`, beta) can catch violations at build time. Cooperative threading under a global lock eliminates the need for synchronization.
84
90
- Go: The Go SDK has no runtime sandbox. Therefore, non-determinism bugs will never be immediately apparent, and are usually only observable during replay. The optional `workflowcheck` static analysis tool can be used to check for many sources of non-determinism at compile time.
@@ -88,6 +94,7 @@ Regardless of which SDK you are using, it is your responsibility to ensure that
88
94
## Detecting Non-Determinism
89
95
90
96
### During Execution
97
+
91
98
- `NondeterminismError` raised when Commands don't match Events
92
99
- Workflow becomes blocked until code is fixed
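
As a rough mental model of the mismatch described above, the toy check below compares the commands produced by re-running workflow code against the events already recorded in history. It is not how any Temporal SDK is implemented: `check_replay`, the string-encoded commands, and the local `NondeterminismError` class are all hypothetical and exist only for illustration.

```python
from typing import Sequence


class NondeterminismError(Exception):
    """Toy stand-in for the error named above; not an SDK class."""


def check_replay(history_events: Sequence[str], commands: Sequence[str]) -> None:
    """Compare commands produced on replay with recorded events, in order."""
    # zip() stops at the shorter sequence, so only the overlapping prefix is checked.
    for index, (event, command) in enumerate(zip(history_events, commands)):
        if event != command:
            raise NondeterminismError(
                f"step {index}: history has {event!r}, code produced {command!r}"
            )
```

In this model, swapping the order of two steps in workflow code after the history was recorded makes the check fail at the first differing step, which mirrors why the workflow stays blocked until the code matches history again.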
93
100
@@ -98,13 +105,17 @@ Replay tests verify that workflows follow identical code paths when re-run, by a
98
105
## Recovery from Non-Determinism
99
106
100
107
### Accidental Change
108
+
101
109
If you accidentally introduced non-determinism:
110
+
102
111
1. Revert code to match what's in history
103
112
2. Restart worker
104
113
3. Workflow auto-recovers
105
114
106
115
### Intentional Change
116
+
107
117
If you need to change workflow logic:
118
+
108
119
1. Use the **Patching API** to support both old and new code paths
109
120
2. Or terminate old workflows and start new ones with updated code
skills/temporal-developer/references/core/dev-management.md
Lines changed: 0 additions & 1 deletion
@@ -20,7 +20,6 @@ When you need a new worker, you should start it in the background (and preferrab
20
20
21
21
**Best practice**: As far as local development goes, run only ONE worker instance with the latest code. Don't keep stale workers (running old code) around.
22
22
23
-
24
23
### Cleanup
25
24
26
25
**Always kill workers when done.** Don't leave workers running.
skills/temporal-developer/references/core/error-reference.md
Lines changed: 8 additions & 8 deletions
@@ -6,14 +6,14 @@
6
6
|**Deadlock**| TMPRL1101 |`WorkflowTaskFailed` in history, worker logs | Workflow blocked too long (deadlock detected) | Remove blocking operations from workflow code (no I/O, no sleep, no threading locks). Use Temporal primitives instead. |https://github.com/temporalio/rules/blob/main/rules/TMPRL1101.md|
7
7
|**Unfinished handlers**| TMPRL1102 |`WorkflowTaskFailed` in history | Workflow completed while update/signal handlers still running | Ensure all handlers complete before workflow finishes. Use `workflow.wait_condition()` to wait for handler completion. |https://github.com/temporalio/rules/blob/main/rules/TMPRL1102.md|
8
8
|**Payload overflow**| TMPRL1103 |`WorkflowTaskFailed` or `ActivityTaskFailed` in history | Payload size limit exceeded (default 2MB) | Reduce payload size. Use external storage (S3, database) for large data and pass references instead. |https://github.com/temporalio/rules/blob/main/rules/TMPRL1103.md|
9
-
|**Workflow code bug**||`WorkflowTaskFailed` in history | Bug in workflow logic | Fix code → Restart worker → Workflow auto-resumes ||
10
-
|**Missing workflow**|| Worker logs | Workflow not registered | Add to worker.py → Restart worker ||
11
-
|**Missing activity**|| Worker logs | Activity not registered | Add to worker.py → Restart worker ||
12
-
|**Activity bug**||`ActivityTaskFailed` in history | Bug in activity code | Fix code → Restart worker → Auto-retries ||
skills/temporal-developer/references/core/gotchas.md
Lines changed: 23 additions & 5 deletions
@@ -9,6 +9,7 @@ This document provides a general overview of conceptual-level gotchas in Tempora
9
9
**The Problem**: Activities may execute more than once due to retries or Worker failures. If an activity calls an external service without an idempotency key, you may charge a customer twice, send duplicate emails, or create duplicate records.
10
10
11
11
**Symptoms**:
12
+
12
13
- Duplicate side effects (double charges, duplicate notifications)
13
14
- Data inconsistencies after retries
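
The idempotency-key idea can be illustrated with a small in-memory sketch. Everything here is hypothetical: `FakePaymentService` stands in for an external service that deduplicates by key, and deriving the key from a workflow ID and an activity ID is an assumption about what stays stable across retries of the same activity.

```python
from typing import Dict


class FakePaymentService:
    """Hypothetical external service that deduplicates charges by key."""

    def __init__(self) -> None:
        self._charges: Dict[str, str] = {}

    def charge(self, idempotency_key: str, customer_id: str, amount_cents: int) -> str:
        # A repeated key returns the original charge instead of charging again.
        if idempotency_key in self._charges:
            return self._charges[idempotency_key]
        charge_id = f"charge-{len(self._charges) + 1}"
        self._charges[idempotency_key] = charge_id
        return charge_id

    @property
    def charge_count(self) -> int:
        return len(self._charges)


def make_idempotency_key(workflow_id: str, activity_id: str) -> str:
    """Build a key that is identical on every retry of the same activity."""
    return f"{workflow_id}:{activity_id}"
```

Calling `charge` twice with the same key, as a retried activity would, leaves exactly one charge recorded.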
14
15
@@ -21,18 +22,20 @@ This document provides a general overview of conceptual-level gotchas in Tempora
21
22
**The Problem**: Code in workflow functions runs on first execution AND on every replay. Any side effect (logging, notifications, metrics, etc.) will happen multiple times and non-deterministic code (IO, current time, random numbers, threading, etc.) won't replay correctly.
22
23
23
24
**Symptoms**:
25
+
24
26
- Non-determinism errors
25
27
- Sandbox violations, depending on SDK language
26
28
- Duplicate log entries
27
29
- Multiple notifications for the same event
28
30
- Inflated metrics
29
31
30
32
**The Fix**:
33
+
31
34
- Use Temporal replay-aware managed side effects for common, non-business logic cases:
32
-
- Temporal workflow logging
33
-
- Temporal date time
34
-
- Temporal UUID generation
35
-
- Temporal random number generation
35
+
- Temporal workflow logging
36
+
- Temporal date time
37
+
- Temporal UUID generation
38
+
- Temporal random number generation
36
39
- Put all other side effects in Activities
37
40
38
41
See `references/core/determinism.md` for more info.
@@ -42,10 +45,12 @@ See `references/core/determinism.md` for more info.
42
45
**The Problem**: If Worker A runs part of a workflow with code v1, then Worker B (with code v2) picks it up, replay may produce different Commands.
43
46
44
47
**Symptoms**:
48
+
45
49
- Non-determinism errors after deploying new code
46
50
- Errors mentioning "command mismatch" or "unexpected command"
47
51
48
52
**The Fix**:
53
+
49
54
- Use Worker Versioning for production deployments
50
55
- Use patching APIs
51
56
- During development: kill old workers before starting new ones
@@ -60,6 +65,7 @@ See `references/core/versioning.md` for more info.
60
65
**The Problem**: Using aggressive activity retry policies that give up too easily.
61
66
62
67
**Symptoms**:
68
+
63
69
- Workflows failing on transient errors
64
70
- Unnecessary workflow failures during brief outages
65
71
@@ -72,6 +78,7 @@ See `references/core/versioning.md` for more info.
72
78
**The Problem**: Queries and update validators are read-only. Modifying state causes non-determinism on replay, and must strictly be avoided.
73
79
74
80
**Symptoms**:
81
+
75
82
- State inconsistencies after workflow replay
76
83
- Non-determinism errors
77
84
@@ -82,6 +89,7 @@ See `references/core/versioning.md` for more info.
82
89
**The Problem**: Queries and update validators must return immediately. They cannot await activities, child workflows, timers, or conditions.
83
90
84
91
**Symptoms**:
92
+
85
93
- Query / update validator timeouts
86
94
- Deadlocks
87
95
@@ -110,6 +118,7 @@ See language-specific gotchas for details.
110
118
**The Problem**: Not testing what happens when things go wrong.
111
119
112
120
**Questions to answer**:
121
+
113
122
- What happens when an Activity exhausts all retries?
114
123
- What happens when a workflow is cancelled mid-execution?
115
124
- What happens during a Worker restart?
@@ -121,6 +130,7 @@ See language-specific gotchas for details.
121
130
**The Problem**: Changing workflow code without verifying existing workflows can still replay.
122
131
123
132
**Symptoms**:
133
+
124
134
- Non-determinism errors after deployment
125
135
- Stuck workflows that can't make progress
126
136
@@ -133,6 +143,7 @@ See language-specific gotchas for details.
133
143
**The Problem**: Catching errors without proper handling hides failures.
- **Non-retryable**: Invalid input, authentication failures, business rule violations, resource not found
153
166
@@ -158,6 +171,7 @@ See language-specific gotchas for details.
158
171
**The Problem**: When a workflow is cancelled, cleanup code after the cancellation point doesn't run unless explicitly protected.
159
172
160
173
**Symptoms**:
174
+
161
175
- Resources not released after cancellation
162
176
- Incomplete compensation/rollback
163
177
- Leaked state
@@ -169,10 +183,12 @@ See language-specific gotchas for details.
169
183
**The Problem**: Activities must opt in to receive cancellation. Without proper handling, a cancelled activity continues running to completion, wasting resources.
170
184
171
185
**Requirements for activity cancellation**:
186
+
172
187
1. **Heartbeating** - Cancellation is delivered via heartbeat. Activities that don't heartbeat won't know they've been cancelled.
173
188
2. **Checking for cancellation** - Activity must explicitly check for cancellation or await a cancellation signal.
174
189
175
190
**Symptoms**:
191
+
176
192
- Cancelled activities running to completion
177
193
- Wasted compute on work that will be discarded
178
194
- Delayed workflow cancellation
@@ -184,11 +200,13 @@ See language-specific gotchas for details.
184
200
**The Problem**: Temporal has built-in limits on payload sizes. Exceeding them causes workflows to fail.
185
201
186
202
**Limits**:
203
+
187
204
- Max 2MB per individual payload
188
205
- Max 4MB per gRPC message
189
-
- Max 50MB for complete workflow history (aim for <10MB in practice)
206
+
- Max 50MB for complete workflow history (aim for <10MB in practice)
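
A guard for the per-payload limit might look like the sketch below. It is illustrative only: `to_payload_or_reference` and the dict-based `store` are hypothetical, the size is measured on a JSON encoding (the real serialized size depends on the data converter in use), and reading 2MB as 2 * 1024 * 1024 bytes is an assumption.

```python
import json
from typing import Any, Dict

# Assumption: the 2MB per-payload limit interpreted as binary megabytes.
MAX_PAYLOAD_BYTES = 2 * 1024 * 1024


def payload_size(value: Any) -> int:
    """Approximate the payload size as the length of its UTF-8 JSON encoding."""
    return len(json.dumps(value).encode("utf-8"))


def to_payload_or_reference(value: Any, store: Dict[str, Any], key: str) -> Dict[str, Any]:
    """Pass small values inline; park large ones in external storage by key."""
    if payload_size(value) <= MAX_PAYLOAD_BYTES:
        return {"inline": value}
    # `store` stands in for S3, a database, or similar external storage.
    store[key] = value
    return {"ref": key}
```

The workflow then carries only the small `{"ref": ...}` record, and the activity that needs the data fetches it from storage.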