This repository was archived by the owner on Sep 2, 2022. It is now read-only.

Savepoints flow changes #404

Open
@shashken


Hey guys, following the comments from @functicons and @elanv on my PR #392, and elanv's intermediate fix in #401 for savepoints (where more than one was being triggered), I would like to have a discussion here before moving on to solving this issue.
I have a few things I want to bring up:

  1. We need to decide how to make sure no more than one SP is triggered at a given time. Right now I have implemented a TriggerTime mechanism that does not allow more than X triggers in a given period.
    Maybe we want a different approach: check the JobManager for an SP/CP in progress, and refuse to trigger an SP while one is active.
    I would like your opinion on what to do and where to implement this, as you both know the operator better than I do @functicons @elanv

  2. I think it's very important to add an optional operator flag that disallows submitting a job without a savepoint. Let me explain:
    The configuration used to update a job (restore from an old SP, or trigger a new one and restart) is the same as the configuration used when you submit without the CRD existing. In some cases, starting a job with no savepoint (state) is devastating and will cause corrupted results.
    With the flag set, the job will simply fail with an error if no savepoint can be retrieved.
    This case will become pretty common when the operator is used with a CD solution, where the install and upgrade configurations are basically the same.

  3. I want there to be a way to trigger a job restart plus SP with a CRD update. Right now I have found that when I change an external ConfigMap, I also have to change the parallelism for the operator to notice the change and trigger SP + restart. Any suggestions here?

These might not be best solved in a single PR, but the biggest shortcoming I felt when using the operator was this part regarding savepoints. Everything else works very well for me :)
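
For point 2, the check I have in mind is roughly the following. This is only a sketch: `ResolveSavepoint`, `ErrSavepointRequired` and the parameter names are made up for illustration, and the order of preference (explicitly configured savepoint first, then the last recorded one) is my assumption, not current operator behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrSavepointRequired is returned when the (hypothetical) flag is set and
// no savepoint is available to restore from.
var ErrSavepointRequired = errors.New("savepoint required but none available")

// ResolveSavepoint picks the savepoint path a job should start from.
// fromSavepoint is a path given explicitly in the job spec, lastSavepoint is
// the most recent one the operator has recorded; either may be empty.
// An empty result with a nil error means "start without state".
func ResolveSavepoint(requireSavepoint bool, fromSavepoint, lastSavepoint string) (string, error) {
	if fromSavepoint != "" {
		return fromSavepoint, nil
	}
	if lastSavepoint != "" {
		return lastSavepoint, nil
	}
	if requireSavepoint {
		// Fail the submit instead of silently starting with empty state.
		return "", ErrSavepointRequired
	}
	return "", nil
}

func main() {
	if _, err := ResolveSavepoint(true, "", ""); err != nil {
		fmt.Println("refusing to submit job:", err)
	}
}
```

With the flag off, behavior stays as it is today, so existing users would not be affected.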
