Commit e15ef4e

Improve task_history documentation (#1461)

- Fix broken self-referencing link to point to runtime.task_history reference
- Add Configuration section with retention and output capture parameters
- Add Querying Task History section with SQL access examples
- Add Persisting Task History section with worker cron backup example

1 parent 820afcb commit e15ef4e

2 files changed

Lines changed: 118 additions & 2 deletions

website/docs/reference/task_history.md

Lines changed: 59 additions & 1 deletion
@@ -10,7 +10,65 @@ The Spice runtime stores information about completed tasks in the `spice.runtime
 
 A span is a unit of trace data that encapsulates the details of a task's execution, including its duration, inputs, and outputs. Spans enable hierarchical tracing by grouping tasks under a parent span, which provides a view of task dependencies and the overall execution flow.
 
-To learn more about task history configuration read [task_history](./task_history).
+For configuration options, see the [`runtime.task_history` reference](./spicepod/runtime#runtimetask_history).
+
+## Configuration
+
+Task history is enabled by default with an 8-hour retention period. To adjust retention and other settings, configure the `runtime.task_history` section in `spicepod.yaml`:
+
+```yaml
+runtime:
+  task_history:
+    enabled: true
+    captured_output: none
+    retention_period: 8h
+    retention_check_interval: 15m
+```
+
+- **`enabled`**: Enable or disable task history. Defaults to `true`.
+- **`captured_output`**: Level of output captured. Defaults to `none`.
+- **`retention_period`**: How long records are retained. Defaults to `8h`. Longer retention periods increase memory usage.
+- **`retention_check_interval`**: How often old records are checked for removal. Defaults to `15m`.
+
+For the full list of parameters, see the [`runtime.task_history` reference](./spicepod/runtime#runtimetask_history).
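To make the interplay of `retention_period` and `retention_check_interval` concrete, here is a minimal Python sketch. It rests on two assumptions that the documentation above does not state: that durations are a whole number followed by `s`, `m`, `h`, or `d`, and that expired records are removed only when the periodic check runs. The helpers `parse_duration` and `worst_case_lifetime` are hypothetical names used for illustration, not part of Spice.

```python
import re
from datetime import timedelta

# Simplified assumption: a duration is a whole number followed by s, m, h, or d.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '8h' or '15m' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def worst_case_lifetime(retention_period: str, retention_check_interval: str) -> timedelta:
    """Upper bound on how long a record may stay in memory, assuming expired
    records are removed only when the periodic check runs."""
    return parse_duration(retention_period) + parse_duration(retention_check_interval)


print(worst_case_lifetime("8h", "15m"))  # 8:15:00
```

Under those assumptions, a shorter `retention_check_interval` tightens the bound on memory held by expired records, at the cost of more frequent checks.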
+
+## Querying Task History
+
+Task history is queryable as a standard SQL table at `runtime.task_history` (or `spice.runtime.task_history`). To retrieve all recorded tasks, run:
+
+```sql
+SELECT * FROM runtime.task_history;
+```
+
+This query can be issued through any Spice SQL interface, including the [HTTP API](../api/HTTP/post-sql), [Arrow Flight SQL](../api/arrow-flight-sql), or the [Spice SQL REPL](../cli/reference/sql).
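As an illustration of the HTTP route, the following Python sketch builds and sends the query using only the standard library. The base URL `http://localhost:8090`, the `/v1/sql` path, and the plain-text request body are assumptions about a locally running runtime and are not stated in this diff; check them against the linked HTTP API page. `build_sql_request` and `run_query` are hypothetical helper names.

```python
import urllib.request

# Assumption: a local Spice runtime exposes its HTTP API at this address.
SPICE_HTTP_URL = "http://localhost:8090"


def build_sql_request(sql: str, base_url: str = SPICE_HTTP_URL) -> urllib.request.Request:
    """Build a POST request that carries the SQL statement as a plain-text body."""
    return urllib.request.Request(
        url=f"{base_url}/v1/sql",  # assumed endpoint path
        data=sql.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def run_query(sql: str, base_url: str = SPICE_HTTP_URL) -> str:
    """Send the query and return the raw response body; needs a running runtime."""
    with urllib.request.urlopen(build_sql_request(sql, base_url), timeout=10) as resp:
        return resp.read().decode("utf-8")
```

With a runtime listening at the assumed address, `run_query("SELECT * FROM runtime.task_history;")` would return the response body as text. Building the request separately from sending it keeps the request shape easy to inspect without a network call.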
+
+## Persisting Task History
+
+Task history is stored in-memory and subject to the configured `retention_period`. To persist task history beyond the retention window, set up a [worker](./spicepod/workers) with a cron schedule that periodically writes records to an external dataset.
+
+For example, to back up task history to an Iceberg table every 10 minutes:
+
+```yaml
+datasets:
+  - from: glue:team_app.task_history
+    name: task_history_sink
+    mode: read_write
+    params:
+      glue_auth: key
+      glue_region: us-east-1
+      glue_key: ${secrets:AWS_GLUE_ACCESS_KEY}
+      glue_secret: ${secrets:AWS_GLUE_SECRET_ACCESS_KEY}
+
+workers:
+  - name: backup-task-history
+    cron: '*/10 * * * *' # every 10 minutes
+    sql: |
+      INSERT INTO task_history_sink
+      SELECT * FROM runtime.task_history
+      WHERE start_time >= NOW() - INTERVAL '10' MINUTE;
+```
+
+This approach writes recent task history records to a durable store on a regular schedule, ensuring data is available for later analysis even after the in-memory retention window expires.
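The worker's `WHERE start_time >= NOW() - INTERVAL '10' MINUTE` filter only lines up with the 10-minute cron schedule if every run fires exactly on time. The Python sketch below models just that predicate to show how a record on a window boundary can be copied twice and how a late run can skip a record. It is a reasoning aid under stated assumptions, not a description of how Spice schedules workers: the run times are made up, and `captured_by` and `times_captured` are hypothetical helpers.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # mirrors INTERVAL '10' MINUTE in the worker SQL


def captured_by(run_time: datetime, start_time: datetime, window: timedelta = WINDOW) -> bool:
    """True if the record already exists at run_time and passes
    the filter start_time >= run_time - window."""
    return run_time - window <= start_time <= run_time


def times_captured(start_time: datetime, run_times: list[datetime]) -> int:
    """Count how many worker runs would copy a record with this start_time."""
    return sum(captured_by(run, start_time) for run in run_times)


# Hypothetical schedule: the third run fires two seconds late.
runs = [
    datetime(2024, 1, 1, 12, 0, 0),
    datetime(2024, 1, 1, 12, 10, 0),
    datetime(2024, 1, 1, 12, 20, 2),
]

print(times_captured(datetime(2024, 1, 1, 12, 5, 0), runs))   # 1: copied once
print(times_captured(datetime(2024, 1, 1, 12, 0, 0), runs))   # 2: boundary record copied twice
print(times_captured(datetime(2024, 1, 1, 12, 10, 1), runs))  # 0: skipped by the late run
```

If gaps matter more than duplicates, one option is a window somewhat longer than the cron interval combined with deduplication in the sink; whether that fits depends on the target table.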
 
 ## Table Schema
 

website/versioned_docs/version-1.11.x/reference/task_history.md

Lines changed: 59 additions & 1 deletion
@@ -10,7 +10,65 @@ The Spice runtime stores information about completed tasks in the `spice.runtime
 
 A span is a unit of trace data that encapsulates the details of a task's execution, including its duration, inputs, and outputs. Spans enable hierarchical tracing by grouping tasks under a parent span, which provides a view of task dependencies and the overall execution flow.
 
-To learn more about task history configuration read [task_history](./task_history).
+For configuration options, see the [`runtime.task_history` reference](./spicepod/runtime#runtimetask_history).
+
+## Configuration
+
+Task history is enabled by default with an 8-hour retention period. To adjust retention and other settings, configure the `runtime.task_history` section in `spicepod.yaml`:
+
+```yaml
+runtime:
+  task_history:
+    enabled: true
+    captured_output: none
+    retention_period: 8h
+    retention_check_interval: 15m
+```
+
+- **`enabled`**: Enable or disable task history. Defaults to `true`.
+- **`captured_output`**: Level of output captured. Defaults to `none`.
+- **`retention_period`**: How long records are retained. Defaults to `8h`. Longer retention periods increase memory usage.
+- **`retention_check_interval`**: How often old records are checked for removal. Defaults to `15m`.
+
+For the full list of parameters, see the [`runtime.task_history` reference](./spicepod/runtime#runtimetask_history).
+
+## Querying Task History
+
+Task history is queryable as a standard SQL table at `runtime.task_history` (or `spice.runtime.task_history`). To retrieve all recorded tasks, run:
+
+```sql
+SELECT * FROM runtime.task_history;
+```
+
+This query can be issued through any Spice SQL interface, including the [HTTP API](../api/HTTP/post-sql), [Arrow Flight SQL](../api/arrow-flight-sql), or the [Spice SQL REPL](../cli/reference/sql).
+
+## Persisting Task History
+
+Task history is stored in-memory and subject to the configured `retention_period`. To persist task history beyond the retention window, set up a [worker](./spicepod/workers) with a cron schedule that periodically writes records to an external dataset.
+
+For example, to back up task history to an Iceberg table every 10 minutes:
+
+```yaml
+datasets:
+  - from: glue:team_app.task_history
+    name: task_history_sink
+    mode: read_write
+    params:
+      glue_auth: key
+      glue_region: us-east-1
+      glue_key: ${secrets:AWS_GLUE_ACCESS_KEY}
+      glue_secret: ${secrets:AWS_GLUE_SECRET_ACCESS_KEY}
+
+workers:
+  - name: backup-task-history
+    cron: '*/10 * * * *' # every 10 minutes
+    sql: |
+      INSERT INTO task_history_sink
+      SELECT * FROM runtime.task_history
+      WHERE start_time >= NOW() - INTERVAL '10' MINUTE;
+```
+
+This approach writes recent task history records to a durable store on a regular schedule, ensuring data is available for later analysis even after the in-memory retention window expires.
 
 ## Table Schema
 
