docs/develop/python/data-handling/index.mdx (14 additions, 13 deletions)
id: data-handling
title: Data handling - Python SDK
sidebar_label: Data handling
slug: /develop/python/data-handling
description:
  Learn how Temporal handles data through the Data Converter, including payload conversion, encryption, and large
  payload storage.
toc_max_heading_level: 3
tags:
  - Python SDK
  - Temporal SDKs
  - Data Converters
---
All data sent to and from the Temporal Service passes through the **Data Converter**. The Data Converter has three layers that handle different concerns:
```
Application data → PayloadConverter → PayloadCodec → ExternalStorage → Temporal Service
```
Of these three layers, only the PayloadConverter is required. Temporal uses a default PayloadConverter that handles JSON serialization. The PayloadCodec and ExternalStorage layers are optional. You only need to customize these layers when your application requires non-JSON types, encryption, or payload offloading.
description: Offload large payloads to external storage using the claim check pattern in the Python SDK.
---

The Temporal Service enforces a ~2 MB per payload limit. When your Workflows or Activities handle data larger than the limit, you can offload payloads to external storage, such as S3, and pass a small reference token through the event history instead. This is sometimes called the [claim check pattern](https://en.wikipedia.org/wiki/Claim_check_pattern).
External storage sits at the end of the data pipeline, after both the Payload Converter and the Payload Codec:
```
User code → PayloadConverter → PayloadCodec → External Storage → Temporal Service
```
When a payload exceeds a configurable size threshold (default 256 KiB), the storage driver uploads it to your external store and replaces it with a lightweight reference. Payloads below the threshold stay inline in the event history. On the way back, reference payloads are retrieved from external storage before the codec decodes them.
External storage runs after the codec, so if you use an encryption codec, payloads are already encrypted before they're uploaded to your store.
## Store and retrieve large payloads using external storage
To offload large payloads, implement a `StorageDriver` and configure it on your `DataConverter`. The driver needs a `store()` method to upload payloads and a `retrieve()` method to fetch them back.

Once you implement a storage driver, configure it on your `DataConverter` and use that converter when creating your Client and Worker. All Workflows and Activities running on the Worker then use the storage driver automatically, without changes to your business logic. You can also configure the size threshold and use multiple storage drivers.
### Implement a storage driver
### Store payloads
The `store()` method receives a sequence of payloads and must return exactly one `StorageDriverClaim` per payload. A claim is a set of string key-value pairs that the driver uses to locate the payload later, typically a storage key or URL.
Sample implementation:
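As a minimal, self-contained sketch of this contract: the `Payload` and `StorageDriverClaim` classes below are simplified stand-ins for the SDK's real types, and the driver "uploads" to an in-memory dict where a real implementation like the page's `S3StorageDriver` would call S3.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Payload:
    """Simplified stand-in for the SDK's Payload: bytes plus metadata."""
    metadata: dict
    data: bytes


@dataclass
class StorageDriverClaim:
    """String key-value pairs that locate an offloaded payload later."""
    keys: dict


class InMemoryStorageDriver:
    """Toy driver: stores payload bytes in a dict instead of an S3 bucket."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def store(self, payloads):
        # Must return exactly one claim per payload, in the same order.
        claims = []
        for payload in payloads:
            key = str(uuid.uuid4())
            self._objects[key] = payload.data
            claims.append(StorageDriverClaim(keys={"key": key}))
        return claims

    def retrieve(self, claims):
        # Resolve each claim back to the payload bytes it stands for.
        return [
            Payload(metadata={}, data=self._objects[claim.keys["key"]])
            for claim in claims
        ]
```

An S3-backed `store()` would put each payload under a generated object key and record that key (or a full URL) in the claim so `retrieve()` can fetch it later.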
### Adjust the size threshold
The `payload_size_threshold` controls which payloads get offloaded. Payloads smaller than this value stay inline in the event history.
```python
ExternalStorage(
    # …
)
```

Set it to `None` to externalize all payloads regardless of size.
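The offloading rule can be expressed as a small predicate. This is a sketch of the documented behavior, not SDK code; the helper name is made up for illustration.

```python
from typing import Optional

DEFAULT_THRESHOLD = 256 * 1024  # 256 KiB, the documented default


def should_externalize(payload_size: int, threshold: Optional[int] = DEFAULT_THRESHOLD) -> bool:
    """Payloads smaller than the threshold stay inline in the event history;
    everything else is offloaded. A threshold of None externalizes all payloads."""
    if threshold is None:
        return True
    return payload_size >= threshold
```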
### Use multiple storage drivers
When you have multiple drivers (for example, hot and cold storage tiers), provide a `driver_selector` function that chooses which driver handles each payload:
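For example, a selector might route payloads by size. The exact `driver_selector` signature is an assumption here (the drivers are string stand-ins); check the SDK reference for the real one.

```python
from dataclasses import dataclass


@dataclass
class Payload:
    data: bytes


# Stand-ins for two configured storage drivers.
HOT_DRIVER = "hot-tier"
COLD_DRIVER = "cold-tier"


def driver_selector(payload: Payload) -> str:
    """Send payloads of 1 MiB or more to the cold tier, the rest to hot."""
    return COLD_DRIVER if len(payload.data) >= 1024 * 1024 else HOT_DRIVER
```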
docs/encyclopedia/data-conversion/dataconversion.mdx (33 additions, 19 deletions)
id: dataconversion
title: How does Temporal handle application data?
sidebar_label: Data conversion
description:
  This guide explores Data Converters in the Temporal Platform, detailing how they handle serialization and encoding for
  Workflow inputs and outputs, ensuring data stays secure and manageable.
slug: /dataconversion
toc_max_heading_level: 4
keywords:
import { CaptionedImage } from '@site/src/components';
This guide provides an overview of data handling using a Data Converter on the Temporal Platform.
Data Converters in Temporal are SDK components that handle the serialization and encoding of data entering and exiting a Temporal Service. Workflow inputs and outputs need to be serialized and deserialized so they can be sent as JSON to a Temporal Service.
<CaptionedImage src="/diagrams/default-data-converter.svg" title="Data Converter encodes and decodes data" />
The Data Converter encodes data from your application to a [Payload](/dataconversion#payload) before it is sent to the Temporal Service in the Client call. When the Temporal Server sends the encoded data back to the Worker, the Data Converter decodes it for processing within your application. This ensures that all your sensitive data exists in its original format only on hosts that you control.
Data Converter steps are followed when data is sent to a Temporal Service (as input to a Workflow) and when it is returned from a Workflow (as output). Due to how Temporal provides access to Workflow output, this implementation is asymmetric:
- Data encoding is performed automatically using the default converter provided by Temporal or your custom Data Converter when passing input to a Temporal Service. For example, plain text input is usually serialized into a JSON object.
- Data decoding may be performed by your application logic during your Workflows or Activities as necessary, but decoded Workflow results are never persisted back to the Temporal Service. Instead, they are stored encoded on the Temporal Service, and you need to provide an additional parameter when using [`temporal workflow show`](/cli/workflow#show) or when browsing the Web UI to view output.
Each piece of data (like a single argument or return value) is encoded as a [Payload](/dataconversion#payload), which consists of binary data and key-value metadata.
For details, see the API references:
### What is a Payload? {#payload}
A [Payload](https://api-docs.temporal.io/#temporal.api.common.v1.Payload) represents binary data such as input and output from Activities and Workflows. Payloads also contain metadata that describe their data type or other parameters for use by custom encoders/converters.
When processed through the SDK, the [default Data Converter](/default-custom-data-converters#default-data-converter) serializes your data/value to a Payload before sending it to the Temporal Server. The default Data Converter processes supported type values to Payloads. You can create a custom [Payload Converter](/payload-converter) to apply different conversion steps.
You can additionally apply [custom codecs](/payload-codec), such as for encryption or compression, on your Payloads.
When Payloads are too large for the Temporal Service's ~2 MB limit, you can use [External Storage](/external-storage) to offload them to an external store like S3 and keep only a reference in the Event History.
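To make the Payload shape concrete, here is a minimal sketch of turning a value into a Payload the way the default JSON converter does. The classes and helper functions are illustrative stand-ins rather than SDK code; the `json/plain` encoding tag follows the convention Temporal's default converter uses.

```python
import json
from dataclasses import dataclass


@dataclass
class Payload:
    # Binary data plus key-value metadata, mirroring temporal.api.common.v1.Payload.
    metadata: dict
    data: bytes


def to_json_payload(value) -> Payload:
    """Serialize a value to JSON bytes and tag its encoding in the metadata."""
    return Payload(
        metadata={"encoding": b"json/plain"},
        data=json.dumps(value).encode("utf-8"),
    )


def from_json_payload(payload: Payload):
    """Reverse the conversion: check the encoding tag, then deserialize."""
    assert payload.metadata["encoding"] == b"json/plain"
    return json.loads(payload.data.decode("utf-8"))
```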
docs/troubleshooting/blob-size-limit-error.mdx (9 additions, 9 deletions)
To resolve this error, reduce the size of the blob so that it is within the 4 MB limit.
There are multiple strategies you can use to avoid this error:
1. Use [External Storage](/external-storage) to offload large payloads to an object store like S3. The Temporal SDKs support this natively through the claim check pattern: when a payload exceeds a size threshold, a storage driver uploads it to your external store and replaces it with a small reference token in the Event History. Your Workflow and Activity code doesn't need to change. Even if your payloads are within the limit today, consider implementing External Storage if their size could grow over time.

   For SDK-specific guides, see:

   - [Python: Large payload storage](/develop/python/data-handling/large-payload-storage)
   - [TypeScript: Large payload storage](/develop/typescript/data-handling/large-payload-storage)
2. Use compression with a [custom Payload Codec](/payload-codec) for large payloads. This addresses the immediate issue, but if payload sizes continue to grow, the problem can arise again.

3. Break larger batches of commands into smaller batch sizes:
   - Workflow-level batching:
     1. Change the Workflow to process Activities or Child Workflows in smaller batches.
     2. Iterate through each batch, waiting for completion before moving to the next.
   - Workflow Task-level batching:
     1. Execute Activities in smaller batches within a single Workflow Task.
     2. Introduce brief pauses or sleeps between batches.
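The workflow-level batching steps above can be sketched as follows. This is a plain-`asyncio` illustration of the pattern only: `process_item` stands in for an Activity invocation, and in a real Workflow you would await `workflow.execute_activity` instead.

```python
import asyncio


async def process_item(item: int) -> int:
    # Stand-in for an Activity invocation.
    return item * 2


async def run_in_batches(items: list, batch_size: int) -> list:
    """Process items in fixed-size batches, waiting for each batch to finish
    before starting the next, so fewer commands accumulate per Workflow Task."""
    results = []
    for start in range(0, len(items), batch_size):
        batch = items[start : start + batch_size]
        results.extend(await asyncio.gather(*(process_item(i) for i in batch)))
    return results
```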