feat: add TTL strategy to Argo WorkflowTemplates by npow · Pull Request #3104 · Netflix/metaflow

npow · 2026-04-13T21:55:33Z

Summary

Add TTL strategy support to Argo WorkflowTemplates via METAFLOW_ARGO_WORKFLOW_TTL_SECONDS config
Automatically cleans up completed workflow pods after the configured TTL, reducing cluster resource usage
Fix: revert CronWorkflow schedules array back to schedule string field for broader Argo version compatibility

Test plan

test/unit/test_argo_ttl_strategy.py — unit tests for TTL strategy configuration

🤖 Generated with Claude Code

… use schedules Add a configurable ttlStrategy to Argo WorkflowTemplates so completed workflows are automatically cleaned up after 7 days (matching the default Kubernetes job TTL). Configurable via METAFLOW_ARGO_WORKFLOWS_TTL_SECONDS_AFTER_COMPLETION env var; set to 0 to disable. Also update CronWorkflow creation to use the `schedules` array field instead of the deprecated singular `schedule` field, per Argo Workflows deprecation notice. Closes #1231, closes #2351

The schedules array field requires Argo Workflows v3.6+ and breaks older installations. Keep the singular schedule field which is supported across all versions.

greptile-apps · 2026-04-13T21:58:20Z

Greptile Summary

This PR adds ttlStrategy support to Argo WorkflowTemplate specs, controlled by a new METAFLOW_ARGO_WORKFLOWS_TTL_SECONDS_AFTER_COMPLETION config variable (default: 7 days). The implementation is clean and well-tested, but the non-zero default is a silent behavior change — existing users who upgrade will have completed workflow resources auto-deleted after 7 days without any explicit opt-in.

Confidence Score: 5/5

Safe to merge; all findings are P2 style/suggestion level.

The implementation is correct and tested. The two findings are both P2: one is a design preference (default TTL value), and the other is a minor code quality improvement (double int() call). Neither blocks correctness.

Review the default value in metaflow/metaflow_config.py — consider whether 0 (opt-out) is safer than 7 days (opt-in) for the default TTL.

Important Files Changed

Filename	Overview
metaflow/metaflow_config.py	Adds `ARGO_WORKFLOWS_TTL_SECONDS_AFTER_COMPLETION` config; default of 7 days is a silent behavior change for existing users.
metaflow/plugins/argo/argo_workflows.py	Adds `ttl_strategy()` to `WorkflowSpec` and wires it into the WorkflowTemplate builder; logic is correct, minor `int()` redundancy.
test/unit/test_argo_ttl_strategy.py	Good coverage of positive, zero, None, negative, string-input, and chaining cases for `ttl_strategy()`.

_{Reviews (1): Last reviewed commit: "test: add unit tests for WorkflowSpec.tt..." | Re-trigger Greptile}

greptile-apps · 2026-04-13T21:58:24Z

+ARGO_WORKFLOWS_TTL_SECONDS_AFTER_COMPLETION = from_conf(
+    "ARGO_WORKFLOWS_TTL_SECONDS_AFTER_COMPLETION", 7 * 24 * 60 * 60
+)


Opt-in default TTL is a behavior change for existing users

The default of 7 days means every existing Metaflow user who upgrades will silently start having their completed Argo Workflow resources auto-deleted after 7 days. Users who rely on workflow history beyond that window (e.g. for audit, debugging, or custom tooling that queries old workflow objects) will lose it without any explicit action. A safer default might be None or 0 (disable TTL) to preserve the existing behavior, with opt-in for cleanup.

greptile-apps · 2026-04-13T21:58:25Z

+    def ttl_strategy(self, seconds_after_completion):
+        # https://argoproj.github.io/argo-workflows/fields/#ttlstrategy
+        if seconds_after_completion is not None and int(seconds_after_completion) > 0:
+            self.payload["ttlStrategy"] = {
+                "secondsAfterCompletion": int(seconds_after_completion)
+            }
+        return self


int() called twice and raises unhandled ValueError on bad env-var input

int(seconds_after_completion) is evaluated twice (once in the condition, once in the payload). More importantly, if a user sets METAFLOW_ARGO_WORKFLOWS_TTL_SECONDS_AFTER_COMPLETION=invalid in their environment, int("invalid") will raise an unhandled ValueError at deploy time with no helpful message. Converting and validating once at the top is cleaner:

Suggested change

def ttl_strategy(self, seconds_after_completion):

# https://argoproj.github.io/argo-workflows/fields/#ttlstrategy

if seconds_after_completion is not None and int(seconds_after_completion) > 0:

self.payload["ttlStrategy"] = {

"secondsAfterCompletion": int(seconds_after_completion)

}

return self

def ttl_strategy(self, seconds_after_completion):

# https://argoproj.github.io/argo-workflows/fields/#ttlstrategy

if seconds_after_completion is not None:

try:

seconds = int(seconds_after_completion)

except (TypeError, ValueError):

seconds = 0

if seconds > 0:

self.payload["ttlStrategy"] = {

"secondsAfterCompletion": seconds

}

return self

Nissan Pow added 4 commits April 13, 2026 14:55

fix: revert CronWorkflow schedules back to schedule field

b412523

The schedules array field requires Argo Workflows v3.6+ and breaks older installations. Keep the singular schedule field which is supported across all versions.

test: add unit tests for WorkflowSpec.ttl_strategy()

510a5ba

test: add unit tests for WorkflowSpec.ttl_strategy()

21bb155

greptile-apps Bot reviewed Apr 13, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add TTL strategy to Argo WorkflowTemplates#3104

feat: add TTL strategy to Argo WorkflowTemplates#3104
npow wants to merge 4 commits intomasterfrom
npow/argo-ttl-schedules-clean

npow commented Apr 13, 2026

Uh oh!

greptile-apps Bot commented Apr 13, 2026

Uh oh!

greptile-apps Bot Apr 13, 2026

Uh oh!

greptile-apps Bot Apr 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

npow commented Apr 13, 2026

Summary

Test plan

Uh oh!

greptile-apps Bot commented Apr 13, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Uh oh!

greptile-apps Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant