Commit 9ab0402

feat: add external storage support to codec server (#4489)
* docs: add external storage support
* docs: remove redundant info from setup page
* address review comments
1 parent f64a821 commit 9ab0402

4 files changed

Lines changed: 86 additions & 45 deletions

File tree

docs/encyclopedia/data-conversion/codec-server.mdx

Lines changed: 55 additions & 0 deletions
@@ -79,6 +79,61 @@ form.
7979
Your Codec Server should use the same Payload Codec implementation as your Workers to ensure consistent encoding and
8080
decoding.
8181

82+
## Codec Server with External Storage {#external-storage}
83+
84+
When your Workers and Clients use [External Storage](/external-storage), your storage drivers replace some payloads in
85+
the Event History with small references that point to data in an external store like Amazon S3. The Temporal Service and
86+
the Web UI only see these references, not the actual payload data. This is further complicated by setups where you run
87+
Codecs in a proxy that encode payloads after the Data Converter has returned on the Worker. Your Codec Server must be able
88+
to handle downloading and decoding in the correct order for you to be able to view the Workflow data in the UI or CLI.
89+
90+
To support External Storage, create a handler using `NewPayloadHTTPHandler` with `PayloadHTTPHandlerOptions`. The options
91+
accept your storage drivers, your pre-storage codecs (the Payload Codecs configured in your Worker's Data Converter),
92+
and any post-storage codecs (codecs applied by a proxy after external storage). The handler applies them in the correct
93+
order across all endpoints automatically. When you configure the handler with storage drivers, the existing endpoints
94+
become storage-aware and a new `/download` endpoint becomes available:
95+
96+
:::caution
97+
98+
`NewPayloadHTTPHandler` runs the full encode-store-encode and decode-retrieve-decode pipeline. Do not use it as a target
99+
for a remote Data Converter or remote codec on your Workers. For remote codecs, use `NewPayloadCodecHTTPHandler`
100+
separately. If you need both, set up `NewPayloadHTTPHandler` for the Web UI and CLI alongside
101+
`NewPayloadCodecHTTPHandler` for your Workers, and configure both with the same codecs.
102+
103+
:::
104+
105+
- **`/download`** retrieves the actual payload data from external storage and decodes it through the Payload Codec. This
106+
endpoint is used internally by `/decode` when it encounters storage references, but you can also call it directly
107+
to retrieve the decoded payload. The Temporal Web UI uses this endpoint when you click to view the full payload for a
108+
storage reference.
109+
- **`/decode`** still decodes encoded payloads, but also handles storage references. By default, `/decode` uses the
110+
download logic internally to retrieve and decode any storage references in the request alongside regular payloads.
111+
With the `?preserveStorageRefs=true` query parameter, `/decode` skips retrieval and returns storage references as-is.
112+
- **`/encode`** applies the Payload Codec, then uploads payloads that exceed the size threshold to external storage and
113+
replaces them with reference tokens.
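
As a rough illustration of the `/decode` behavior described above, the following sketch models how the `?preserveStorageRefs=true` query parameter could switch between resolving and passing through storage references. It uses only the Go standard library and does not use `NewPayloadHTTPHandler`. The JSON body shape, the `example/storage-ref` marker, the in-memory `store`, and all function names are assumptions made for this example, and codec decoding is left out to keep the focus on reference handling.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// payload is an assumed JSON shape for this sketch. Go encodes []byte
// values as base64 strings when marshaling to JSON.
type payload struct {
	Metadata map[string][]byte `json:"metadata"`
	Data     []byte            `json:"data"`
}

type body struct {
	Payloads []payload `json:"payloads"`
}

// refEncoding is a hypothetical metadata value that marks a storage reference.
const refEncoding = "example/storage-ref"

// store is a hypothetical in-memory stand-in for an external store.
var store = map[string][]byte{"key-1": []byte("large plaintext")}

func isRef(p payload) bool { return string(p.Metadata["encoding"]) == refEncoding }

// resolve swaps a storage reference for the data it points to.
func resolve(p payload) payload {
	return payload{
		Metadata: map[string][]byte{"encoding": []byte("text/plain")},
		Data:     store[string(p.Data)],
	}
}

// decodeHandler models /decode: references are resolved unless the caller
// sets ?preserveStorageRefs=true. Codec decoding is omitted for brevity.
func decodeHandler(w http.ResponseWriter, r *http.Request) {
	var in body
	if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	preserve := r.URL.Query().Get("preserveStorageRefs") == "true"
	out := body{Payloads: make([]payload, 0, len(in.Payloads))}
	for _, p := range in.Payloads {
		if isRef(p) && !preserve {
			p = resolve(p)
		}
		out.Payloads = append(out.Payloads, p)
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(out)
}

// call posts one storage reference to the handler and returns the data
// field of the first payload in the response.
func call(url string) string {
	req := body{Payloads: []payload{{
		Metadata: map[string][]byte{"encoding": []byte(refEncoding)},
		Data:     []byte("key-1"),
	}}}
	raw, _ := json.Marshal(req)
	rec := httptest.NewRecorder()
	decodeHandler(rec, httptest.NewRequest(http.MethodPost, url, bytes.NewReader(raw)))
	var resp body
	_ = json.Unmarshal(rec.Body.Bytes(), &resp)
	return string(resp.Payloads[0].Data)
}

func main() {
	fmt.Println(call("/decode"))                          // reference resolved
	fmt.Println(call("/decode?preserveStorageRefs=true")) // reference returned as-is
}
```

In this model the reference is only a key into the store, which is why returning it unchanged is cheap. A real handler would also run the Payload Codec before and after retrieval.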
114+
115+
<CaptionedImage
116+
src="/diagrams/codec-server-with-external-storage.svg"
117+
srcDark="/diagrams/codec-server-with-external-storage-dark.svg"
118+
width="100%"
119+
title="Codec Server with External Storage"
120+
/>
121+
122+
The following example walks through how all three endpoints work together:
123+
124+
1. A user starts a Workflow from the CLI with a plaintext input. The CLI sends the input to the Codec Server's `/encode`
125+
endpoint.
126+
2. The Codec Server encodes the payload through the Payload Codec. The encoded payload exceeds the storage threshold,
127+
so the Codec Server uploads it to external storage and returns a small reference token.
128+
3. The CLI sends the reference token to the Temporal Service, which stores it in the Event History.
129+
4. Later, a user views the Workflow in the Web UI. The Web UI retrieves the Event History from the Temporal Service and
130+
sends the payloads to the Codec Server's `/decode` endpoint with the `?preserveStorageRefs=true` query parameter.
131+
5. The Codec Server decodes any non-reference payloads through the Payload Codec, but returns storage references as-is.
132+
The Web UI displays the reference metadata, indicating the payload is stored externally.
133+
6. The user clicks to view the full payload. The Web UI sends the storage reference to the `/download` endpoint.
134+
7. The Codec Server retrieves the encoded payload from external storage, decodes it through the Payload Codec, and
135+
returns the plaintext result to the Web UI.
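
The steps above can be sketched as a small in-memory model. This is a simplified illustration under stated assumptions, not the SDK's implementation: base64 stands in for a real Payload Codec, a map stands in for external storage, and the `server` type, the `ref:` token format, and the 16-byte threshold are all hypothetical.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// Hypothetical values for this sketch only.
const (
	threshold = 16     // encoded size in bytes above which a payload is offloaded
	refPrefix = "ref:" // marks a reference token; ':' never appears in base64 output
)

// server models a Codec Server with an attached external store.
type server struct {
	store map[string]string
	next  int
}

// decodeCodec reverses the stand-in codec (base64).
func decodeCodec(enc string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(enc)
	return string(raw), err
}

// encode applies the codec, then offloads oversized results to the store
// and returns a small reference token in their place (steps 1 and 2).
func (s *server) encode(plain string) string {
	enc := base64.StdEncoding.EncodeToString([]byte(plain))
	if len(enc) <= threshold {
		return enc
	}
	s.next++
	key := fmt.Sprintf("%s%d", refPrefix, s.next)
	s.store[key] = enc
	return key
}

// download retrieves the encoded payload behind a reference and decodes it
// (steps 6 and 7).
func (s *server) download(ref string) (string, error) {
	enc, ok := s.store[ref]
	if !ok {
		return "", fmt.Errorf("unknown reference %q", ref)
	}
	return decodeCodec(enc)
}

// decode returns references as-is when preserveRefs is set (steps 4 and 5);
// otherwise it resolves them through download.
func (s *server) decode(p string, preserveRefs bool) (string, error) {
	if strings.HasPrefix(p, refPrefix) {
		if preserveRefs {
			return p, nil
		}
		return s.download(p)
	}
	return decodeCodec(p)
}

func main() {
	s := &server{store: map[string]string{}}
	token := s.encode("a fairly large workflow input") // offloaded, returns a token
	shown, _ := s.decode(token, true)                  // Web UI first sees the reference
	full, _ := s.download(token)                       // user clicks to view the payload
	fmt.Println(token, shown == token, full)
}
```

The model stores the encoded bytes rather than the plaintext, matching step 7, where the Codec Server retrieves the encoded payload and only then decodes it.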
136+
82137
## Codec Server vs. Payload Codec
83138

84139
A Codec Server runs a [Payload Codec](/payload-codec) internally, so the two are directly connected. The difference is

docs/production-deployment/data-encryption.mdx

Lines changed: 29 additions & 45 deletions
@@ -1,6 +1,6 @@
11
---
22
id: data-encryption
3-
title: Codec Server - Temporal Platform feature guide
3+
title: Codecs and Encryption
44
sidebar_label: Codecs and Encryption
55
description: Encrypt data in Temporal Server to secure Workflow, Activity, and Worker information. Use custom Payload Codecs for encryption/decryption, set up Codec Servers for remote decoding, and ensure secure access.
66
slug: /production-deployment/data-encryption
@@ -18,10 +18,11 @@ tags:
1818

1919
import { CaptionedImage } from '@site/src/components';
2020

21-
Temporal Server stores and persists the data handled in your Workflow Execution.
22-
Encrypting this data ensures that any sensitive application data is secure when handled by the Temporal Server.
21+
The Temporal Service persists data from your Workflow Executions, including inputs, outputs, and results. To protect
22+
sensitive data, use a [Payload Codec](/payload-codec) to encrypt payloads before they reach the Temporal Service. With
23+
encryption enabled, data exists unencrypted only on the Client and the Worker process, on hosts that you control.
2324

24-
For example, if you have sensitive information passed in the following objects that are persisted in the Workflow Execution Event History, use encryption to secure it:
25+
The following data is persisted in the Event History and can be encrypted:
2526

2627
- Inputs and outputs/results in your [Workflow](/workflow-execution), [Activity](/activity-execution), and [Child Workflow](/child-workflows)
2728
- [Signal](/sending-messages#sending-signals) inputs
@@ -30,51 +31,30 @@ For example, if you have sensitive information passed in the following objects t
3031
- [Query](/sending-messages#sending-queries) inputs and results
3132
- Results of [Local Activities](/local-activity) and [Side Effects](/workflow-execution/event#side-effect)
3233
- [Application errors and failures](/references/failures).
33-
Failure messages and call stacks are not encoded as codec-capable Payloads by default; you must explicitly enable encoding these common attributes on failures.
34-
For more details, see [Failure Converter](/failure-converter).
34+
Failure messages and call stacks are not encoded as codec-capable Payloads by default; you must explicitly enable
35+
encoding these common attributes on failures. For more details, see [Failure Converter](/failure-converter).
3536

36-
Using encryption ensures that your sensitive data exists unencrypted only on the Client and the Worker Process that is executing the Workflows and Activities, on hosts that you control.
37+
To view encrypted data in the Web UI and CLI, set up a [Codec Server](/codec-server). The following sections cover how
38+
to set up a Codec Server and configure the Web UI and CLI to use it.
3739

38-
By default, your data is serialized to a [Payload](/dataconversion#payload) by a [Data Converter](/dataconversion).
39-
To encrypt your Payload, configure your custom encryption logic with a [Payload Codec](/payload-codec) and set it with a [custom Data Converter](/default-custom-data-converters#custom-data-converter).
40+
For encryption implementation examples, see the following samples:
4041

41-
A Payload Codec does byte-to-byte conversion to transform your Payload (for example, by implementing compression and/or encryption and decryption) and is an optional step that happens between the Client and the [Payload Converter](/payload-converter):
42-
43-
<CaptionedImage
44-
src="/diagrams/remote-data-encoding.svg"
45-
title="Remote data encoding architecture" />
46-
47-
You can run your Payload Codec with a [Codec Server](/codec-server) and use the Codec Server endpoints in the Web UI and CLI to decode your encrypted Payload locally.
48-
For details on how to set up a Codec Server, see [Codec Server setup](#codec-server-setup).
49-
50-
However, if you plan to set up [remote data encoding](/remote-data-encoding) for your data, ensure that you consider all security implications of running encryption remotely before implementing it.
51-
52-
When implementing a custom codec, it is recommended to perform your compression or encryption on the entire input Payload and store the result in the data field of a new Payload with a different encoding metadata field.
53-
This ensures that the input Payload's metadata is preserved.
54-
When the encoded Payload is sent to be decoded, you can verify the metadata field before applying the decryption.
55-
If your Payload is not encoded, it is recommended to pass the unencoded data to the decode function instead of failing the conversion.
56-
57-
Examples for implementing encryption:
58-
59-
- [Go sample](https://github.com/temporalio/samples-go/tree/main/encryption)
60-
- [Java sample](https://github.com/temporalio/samples-java/tree/main/core/src/main/java/io/temporal/samples/encryptedpayloads)
61-
- [Python sample](https://github.com/temporalio/samples-python/tree/main/encryption)
62-
- [TypeScript sample](https://github.com/temporalio/samples-typescript/tree/main/encryption)
63-
- [.NET sample](https://github.com/temporalio/samples-dotnet/tree/main/src/Encryption)
42+
- [Go](https://github.com/temporalio/samples-go/tree/main/encryption)
43+
- [Java](https://github.com/temporalio/samples-java/tree/main/core/src/main/java/io/temporal/samples/encryptedpayloads)
44+
- [Python](https://github.com/temporalio/samples-python/tree/main/encryption)
45+
- [TypeScript](https://github.com/temporalio/samples-typescript/tree/main/encryption)
46+
- [.NET](https://github.com/temporalio/samples-dotnet/tree/main/src/Encryption)
6447

6548
## Codec Server setup {#codec-server-setup}
6649

6750
Use a Codec Server to programmatically decode your encoded [payloads](/dataconversion#payload).
6851

6952
A Codec Server is an HTTP server that uses your custom Codec logic to decode your data remotely.
7053
The Codec Server is independent of the Temporal Service and decodes your encrypted payloads through predefined endpoints. You create, operate, and manage access to your Codec Server in your own environment.
71-
The Temporal CLI and the Web UI in turn provide built-in hooks to call the Codec Server to decode encrypted payloads on demand.
72-
73-
The Codec Server is independent of the Temporal Server and decodes your encrypted payloads through endpoints.
74-
When you configure a Codec Server endpoint in the Temporal Web UI or CLI, the Web UI and CLI use the remote endpoint to receive decoded payloads from the Codec Server.
54+
When you configure a Codec Server endpoint in the Web UI or CLI, the Web UI and CLI use the remote endpoint to send payloads to, and receive payloads from, the Codec Server.
7555
See [API contract requirements](#api-contract-specifications).
7656

77-
Decoded payloads can then be displayed in the Workflow Execution Event History on the Web UI. Note that when you use a Codec Server, the decoded payloads are decoded and returned on the client side only; payloads on the Temporal Server (whether on Temporal Cloud or a self-hosted Temporal Service) remain encrypted.
57+
Decoded payloads can then be displayed in the Workflow Execution Event History on the Web UI. When you use a Codec Server, payloads are decoded and returned on the client side only. Payloads on the Temporal Service (whether on Temporal Cloud or self-hosted) remain encrypted.
7858

7959
Because you create, operate, and manage access to your Codec Server in your controlled environment, ensure that you consider the following:
8060

@@ -91,7 +71,13 @@ When you create your Codec Server to handle requests from the Web UI, the follow
9171

9272
#### Endpoints
9373

94-
The Web UI and CLI send a POST to a `/decode` endpoint. In your Codec Server, create a `/decode` path and pass the incoming payload to the decode method in your Payload Codec.
74+
The Web UI and CLI send POST requests to the following endpoints on your Codec Server:
75+
76+
- `/decode` passes incoming payloads to the decode method in your Payload Codec.
77+
- `/encode` passes incoming payloads to the encode method in your Payload Codec.
78+
- `/download` retrieves and decodes payloads from [External Storage](/external-storage). This endpoint is only needed if
79+
your Workers use External Storage. See [Codec Server with External Storage](/codec-server#external-storage) for
80+
details.
9581

9682
For examples on how to create your Codec Server, see the following Codec Server implementation samples:
9783

@@ -346,14 +332,12 @@ temporal workflow show \
346332
--codec-auth 'auth-header'
347333
```
348334

349-
### Working with Large Payloads
350-
351-
Codec Servers can be used for more than encryption and decryption of sensitive data.
352-
Codec Server behavior is left up to implementers -- they can also call external services or perform other tasks, as long as they hook in at the encoding and decoding stages of a Workflow payload.
335+
### Working with large payloads
353336

354-
By default, Temporal limits payload size to 4MB.
355-
If this limitation is problematic for your use case, you could implement a codec that persists your payloads to an object store outside of workflow histories.
356-
An example implementation is available from [DataDog](https://github.com/DataDog/temporal-large-payload-codec).
337+
If your payloads exceed the Temporal Service's size limits, use [External Storage](/external-storage) to offload large
338+
payloads to an external store like Amazon S3. When External Storage is configured, your Codec Server can also retrieve
339+
and decode these payloads for viewing in the Web UI and CLI. See
340+
[Codec Server with External Storage](/codec-server#external-storage) for details.
357341

358342
### Temporal Nexus
359343

static/diagrams/codec-server-with-external-storage-dark.svg

Lines changed: 1 addition & 0 deletions

static/diagrams/codec-server-with-external-storage.svg

Lines changed: 1 addition & 0 deletions
