Predicate filters will typically lead you to miss deletions #1768

@jackkleeman

Current and expected behavior

It appears to me that the objects that come with kube watch deletion events don't have a deletion timestamp set, or any other change that would let you conclude that the object has been deleted. When you use .touched_objects(), you totally lose the context that this object is going through a deletion; it's like any other reconciliation, and there are good reasons for this. However, if you are using a predicate (which causes events to be dropped when the object is identical), you will typically drop deletions, as they carry no change in the object except for resourceVersion, which I typically wouldn't look at in a predicate.
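To illustrate the failure mode, here is a minimal std-only sketch. It is a simplified model of a hash-based predicate filter, not the kube-runtime implementation: `Meta`, `generation_predicate` and `PredicateFilter` are hypothetical stand-ins. It assumes the filter remembers one hash per object and only forwards an object when that hash changes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Simplified stand-in for object metadata (NOT the real kube `ObjectMeta`).
#[derive(Clone, Debug, PartialEq)]
pub struct Meta {
    pub name: String,
    pub generation: Option<i64>,
    pub resource_version: String,
}

/// Hypothetical predicate that hashes only the generation and
/// deliberately ignores `resource_version`.
pub fn generation_predicate(meta: &Meta) -> Option<u64> {
    let generation = meta.generation?;
    let mut hasher = DefaultHasher::new();
    generation.hash(&mut hasher);
    Some(hasher.finish())
}

/// Toy filter: caches the last hash per object name and forwards an
/// object only when the hash differs from the cached one.
pub struct PredicateFilter {
    seen: HashMap<String, u64>,
}

impl PredicateFilter {
    pub fn new() -> Self {
        Self { seen: HashMap::new() }
    }

    /// Returns true when the object would be forwarded downstream.
    pub fn passes(&mut self, meta: &Meta) -> bool {
        match generation_predicate(meta) {
            // `insert` returns the previous hash, if any.
            Some(hash) => self.seen.insert(meta.name.clone(), hash) != Some(hash),
            // If the predicate cannot be evaluated, forward the object.
            None => true,
        }
    }
}
```

In this model, an object seen once on apply and then again on delete (same generation, newer resourceVersion) produces the same hash, so the second call to `passes` returns false and the deletion never reaches the reconciler.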

I don't know for sure if this is a bug or just surprising behaviour, but it's definitely very easy to write predicates that miss deletions. What it means in practice is that my operators won't recreate deleted resources, at least until some periodic timer fires. I wonder whether the predicate should have access to some context that this is a deletion, in which case perhaps the predicate shouldn't apply, or could be allowed to return a different hash value?
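One way to picture the idea of giving the filter deletion context is the std-only sketch below. It is purely hypothetical and not an existing kube-runtime API: `Kind` and `ContextAwareFilter` are invented names, and the predicate hash is assumed to be computed by the caller. Deletions bypass the hash comparison entirely and also clear the cached hash.

```rust
use std::collections::HashMap;

/// Hypothetical event context handed to the filter alongside the object.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Kind {
    Applied,
    Deleted,
}

/// Hypothetical filter: deduplicates applied objects by predicate hash,
/// but always forwards deletions.
pub struct ContextAwareFilter {
    seen: HashMap<String, u64>,
}

impl ContextAwareFilter {
    pub fn new() -> Self {
        Self { seen: HashMap::new() }
    }

    /// Returns true when the object would be forwarded downstream.
    pub fn passes(&mut self, name: &str, hash: u64, kind: Kind) -> bool {
        match kind {
            Kind::Deleted => {
                // Forget the cached hash so a recreated object is not
                // mistaken for an unchanged one.
                self.seen.remove(name);
                true
            }
            Kind::Applied => self.seen.insert(name.to_string(), hash) != Some(hash),
        }
    }
}
```

Dropping the cache entry on deletion is a design choice in this sketch: it means an object recreated with the same predicate hash is forwarded again.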

Possible solution

Here's my workaround:

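use futures::StreamExt; // provides `.map` on the watcher stream
use kube::{
    runtime::{metadata_watcher, watcher::Event, WatchStreamExt},
    Resource,
};

let pdb_watcher = metadata_watcher(pdb_api, cfg.clone())
        // Mark deletions while the event type is still visible,
        // i.e. before touched_objects() flattens events into objects.
        .map(|event| ensure_deletion_change(event))
        .touched_objects()
        .predicate_filter(changed_predicate);

/// Bumps `generation` on delete events so that a generation-based
/// predicate sees a change and lets the deletion through.
fn ensure_deletion_change<K: Resource, E>(
    mut event: Result<Event<K>, E>,
) -> Result<Event<K>, E> {
    if let Ok(Event::Delete(ref mut object)) = event {
        let meta = object.meta_mut();
        meta.generation = match meta.generation {
            Some(val) => Some(val + 1),
            None => Some(0),
        };
    }
    event
}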

Essentially, I increment generation for deletion events so that there is an observable change which my predicate can check for. An alternative would be to set the deletion timestamp to now (which will generally not be exactly identical to an existing deletion timestamp on the object, if any).
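The deletion-timestamp alternative can be sketched in the same spirit. This is a std-only model under stated assumptions, not code against the real kube types: `Meta`, `Event`, `mark_deletion` and this `changed_predicate` are hypothetical stand-ins, and the predicate is assumed to hash the deletion timestamp along with the generation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::SystemTime;

/// Simplified stand-in for object metadata (NOT the real kube `ObjectMeta`).
#[derive(Clone, Debug)]
pub struct Meta {
    pub generation: Option<i64>,
    pub deletion_timestamp: Option<SystemTime>,
}

/// Simplified stand-in for a watcher event.
#[derive(Clone, Debug)]
pub enum Event {
    Apply(Meta),
    Delete(Meta),
}

/// Stamps delete events with the current time; other events pass
/// through unchanged.
pub fn mark_deletion(event: Event) -> Event {
    match event {
        Event::Delete(mut meta) => {
            meta.deletion_timestamp = Some(SystemTime::now());
            Event::Delete(meta)
        }
        other => other,
    }
}

/// Hypothetical predicate: hashes generation and deletion timestamp,
/// so a freshly stamped deletion yields a different hash.
pub fn changed_predicate(meta: &Meta) -> Option<u64> {
    let mut hasher = DefaultHasher::new();
    meta.generation.hash(&mut hasher);
    meta.deletion_timestamp.hash(&mut hasher);
    Some(hasher.finish())
}
```

Unlike the generation bump, this variant only helps if the predicate actually includes the deletion timestamp in its hash; a predicate that looks at generation alone would still drop the event.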

Additional context

No response

Environment

.

Configuration and features

No response

Affected crates

kube-runtime

Would you like to work on fixing this bug?

maybe

Labels: ergonomics (ergonomics of the public interface), runtime (controller runtime related)
