
Failed GitJobs may require manual deletion to unblock new executions #4809

@aruiz14

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

While syncing with @sbulage on verifying #4718, we observed that the Job from the offending Fleet CLI execution was in Error state, and that it remained in place after the Fleet controller had obtained its message.
In addition, neither upgrading to a new version (the RC containing the fix) nor Force Updating the GitRepo triggered any action, because this Job already existed.

Expected Behavior

We should revisit this behavior to either:

  • Make Fleet CLI return a zero exit code even on failure, so that the Job does not end up in Error state. The failure result can be written as part of the JSON result payload.
  • Extend the mechanism that removes Successful jobs to also remove failed ones.
  • Revisit the cleanup CronJob to not only delete successful jobs.
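
To illustrate the second and third options, here is a minimal Go sketch of a cleanup predicate that treats failed Jobs the same way as successful ones. It is not Fleet's actual code: `Condition`, `Job`, `isFinished` and `jobsToDelete` are hypothetical names, and the structs are simplified stand-ins for the Kubernetes `batch/v1` Job types, assuming only that a finished Job carries a `Complete` or `Failed` condition with status `True`.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a batch/v1 JobCondition.
type Condition struct {
	Type   string // e.g. "Complete" or "Failed"
	Status string // "True", "False" or "Unknown"
}

// Job is a simplified, hypothetical stand-in for a batch/v1 Job.
type Job struct {
	Name       string
	Conditions []Condition
}

// isFinished reports whether the Job reached a terminal state,
// regardless of whether it succeeded or failed.
func isFinished(j Job) bool {
	for _, c := range j.Conditions {
		if (c.Type == "Complete" || c.Type == "Failed") && c.Status == "True" {
			return true
		}
	}
	return false
}

// jobsToDelete returns the names of all finished Jobs, so that a failed Job
// no longer blocks the creation of a new one.
func jobsToDelete(jobs []Job) []string {
	var names []string
	for _, j := range jobs {
		if isFinished(j) {
			names = append(names, j.Name)
		}
	}
	return names
}

func main() {
	jobs := []Job{
		{Name: "ok-job", Conditions: []Condition{{Type: "Complete", Status: "True"}}},
		{Name: "failed-job", Conditions: []Condition{{Type: "Failed", Status: "True"}}},
		{Name: "running-job"}, // no terminal condition yet, must be kept
	}
	fmt.Println(jobsToDelete(jobs))
}
```

A cleanup that only checks for the `Complete` condition would skip `failed-job` here, which matches the behavior described above. Whether failed Jobs should be deleted immediately or kept for some time to allow inspection is a separate design choice that this sketch does not address.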

Steps To Reproduce

  1. Create a GitRepo with some intentional mistakes, so that its GitJob will not Succeed.
  2. Confirm that the Fleet CLI Job is created and remains in Error state.
  3. Force Update the GitRepo in an attempt to rerun this job.

Environment

- Fleet Version: <=0.14.4

Logs

Anything else?

No response
