Allow configuration of max attempts for a task #2276

Open
@dhpikolo

Description

Currently, a task can be attempted at most 6 times in total (MAX_ATTEMPTS), which caps @retry at times=4. It would be beneficial to make this value configurable.

In our use case, we are working on integrating Argo retries with Metaflow’s retried Argo workflows. An environment variable for this limit would allow us to control how many times a user can retry an Argo workflow.

That said, beyond our specific use case, adding this configuration flexibility would be generally useful.

Current Behaviour

from metaflow import (
    FlowSpec,
    Parameter,
    card,
    project,
    step,
    retry
)


@project(name="dummy_project")
class HelloWorld(FlowSpec):
    force_error = Parameter("force-error", type=bool, default=False)

    @card
    @step
    def start(self):
        # Define the artifact that the end step prints.
        self.my_var = "something"
        print("something")
        self.next(self.end)

    @card
    @retry(times=10)
    @step
    def end(self):
        if self.force_error:
            raise Exception("Testing errors in metaflow")
        print(f"the data artifact is: {self.my_var}")


if __name__ == "__main__":
    HelloWorld()

Running the above flow locally via python hello_world.py run throws the following exception:

Metaflow 2.14.0 executing HelloWorld for user:j.kollipara
Project: dummy_project, Branch: user.j.kollipara
Validating your flow...
    The graph looks good!
Running pylint...
    Pylint is happy!
    Flow failed:
    The maximum number of retries is @retry(times=4).

error: Recipe `_poetry-run` failed with exit code 1

Source code of the above error:

def step_init(self, flow, graph, step, decos, environment, flow_datastore, logger):
    # The total number of attempts must not exceed MAX_ATTEMPTS.
    # attempts = normal task (1) + retries (N) + @catch fallback (1)
    if int(self.attributes["times"]) + 2 > MAX_ATTEMPTS:
        raise MetaflowException(
            "The maximum number of retries is "
            "@retry(times=%d)." % (MAX_ATTEMPTS - 2)
        )
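
To make the arithmetic in that check concrete, here is a minimal standalone sketch of the same rule. It does not import Metaflow: the names max_retry_times and check_retry_times are hypothetical, and ValueError stands in for MetaflowException. The default of 6 is the limit stated in this issue.

```python
DEFAULT_MAX_ATTEMPTS = 6  # limit stated in this issue


def max_retry_times(max_attempts: int = DEFAULT_MAX_ATTEMPTS) -> int:
    """Largest accepted `times` value.

    Two attempts are reserved: the normal task run and the @catch fallback.
    """
    return max_attempts - 2


def check_retry_times(times: int, max_attempts: int = DEFAULT_MAX_ATTEMPTS) -> None:
    """Re-statement of the check quoted above (hypothetical helper).

    Raises ValueError where Metaflow raises MetaflowException.
    """
    if int(times) + 2 > max_attempts:
        raise ValueError(
            "The maximum number of retries is @retry(times=%d)."
            % max_retry_times(max_attempts)
        )


if __name__ == "__main__":
    check_retry_times(4)  # accepted: 4 + 2 does not exceed 6
    try:
        check_retry_times(10)  # rejected: 10 + 2 exceeds 6
    except ValueError as err:
        print(err)
```

With a limit of 6, the largest accepted value is times=4, which matches the error message in the run above.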

Proposed Behaviour

Setting METAFLOW_MAX_ATTEMPTS=12 would allow users to run the above flow, since @retry(times=10) needs 10 + 2 = 12 attempts, which no longer exceeds the limit.
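
One way the proposed variable could be resolved is sketched below. This is an illustration under stated assumptions, not the code in the linked PR or Metaflow's own configuration mechanism: resolve_max_attempts is a hypothetical helper, the fallback of 6 is the current limit described in this issue, and the lower bound of 2 is an assumption derived from the two reserved attempts.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "METAFLOW_MAX_ATTEMPTS"  # name proposed in this issue
DEFAULT_MAX_ATTEMPTS = 6           # current limit described in this issue


def resolve_max_attempts(environ: Optional[Mapping[str, str]] = None) -> int:
    """Return the attempt limit from the environment, or the default.

    Hypothetical helper: reads ENV_VAR from `environ` (os.environ if omitted).
    """
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR)
    if raw is None:
        return DEFAULT_MAX_ATTEMPTS
    value = int(raw)  # raises ValueError on non-numeric input
    # Assumption: at least 2 attempts are needed, one for the normal run
    # and one for the @catch fallback, so smaller limits are rejected.
    if value < 2:
        raise ValueError("%s must be at least 2, got %d" % (ENV_VAR, value))
    return value


if __name__ == "__main__":
    limit = resolve_max_attempts({ENV_VAR: "12"})
    # @retry(times=10) needs 10 + 2 attempts, which fits within 12.
    print(limit, 10 + 2 <= limit)
```

Passing the environment as an argument keeps the helper easy to test without mutating os.environ.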

Activity

dhpikolo commented on Feb 18, 2025

I have already put up a PR with the proposed change. Let me know what you think of it.

linked a pull request that will close this issue on Feb 19, 2025
dhpikolo commented on Feb 19, 2025

I created a new PR, since the old PR was based on the development branch.

      Allow configuration of max attempts for a task · Issue #2276 · Netflix/metaflow