
cluster status update interval inconsistent with --cluster-status-update-frequency in both push and pull mode #6281

Open
@LivingCcj

Description

When the member cluster frequently adds and deletes workloads, karmada-agent updates the cluster status frequently, which is inconsistent with the interval configured via the --cluster-status-update-frequency flag.

What happened:
When the cluster status changes frequently, karmada-controller-manager processes the conditions of the Cluster object frequently.

What you expected to happen:
The interval at which karmada-agent updates cluster.status should be consistent with --cluster-status-update-frequency (default 10s).

How to reproduce it (as minimally and precisely as possible):

Environment:

  • Karmada version: v1.20.10

Activity

added the kind/bug label on Apr 9, 2025
liangyuanpeng (Contributor) commented on Apr 9, 2025

Karmada version: v1.20.10

It seems you put the wrong version here; v1.31 is the latest Karmada version. What is your actual Karmada version?

RainbowMango (Member) commented on Apr 10, 2025

@LivingCcj have you figured out the root cause? It would be great if you could point out the code that doesn't work as expected.

LivingCcj (Contributor, Author) commented on Apr 10, 2025

At the predicate step, the UpdateFunc should ignore the event when only cluster.status has changed, so that the cluster-status controller requeues the Cluster object only at the --cluster-status-update-frequency interval (a sketch of this idea follows the two code blocks below).

In pull mode, the predicate func for karmada-agent:

// NewClusterPredicateOnAgent generates an event filter function with Cluster for karmada-agent.
func NewClusterPredicateOnAgent(clusterName string) predicate.Funcs {
	return predicate.Funcs{
		CreateFunc: func(createEvent event.CreateEvent) bool {
			return createEvent.Object.GetName() == clusterName
		},
		UpdateFunc: func(updateEvent event.UpdateEvent) bool {
			return updateEvent.ObjectOld.GetName() == clusterName
		},
		DeleteFunc: func(deleteEvent event.DeleteEvent) bool {
			return deleteEvent.Object.GetName() == clusterName
		},
		GenericFunc: func(event.GenericEvent) bool {
			return false
		},
	}
}

In push mode, the predicate func for karmada-controller-manager:

clusterPredicateFunc := predicate.Funcs{
	CreateFunc: func(createEvent event.CreateEvent) bool {
		obj := createEvent.Object.(*clusterv1alpha1.Cluster)
		if obj.Spec.SecretRef == nil {
			return false
		}
		return obj.Spec.SyncMode == clusterv1alpha1.Push
	},
	UpdateFunc: func(updateEvent event.UpdateEvent) bool {
		obj := updateEvent.ObjectNew.(*clusterv1alpha1.Cluster)
		if obj.Spec.SecretRef == nil {
			return false
		}
		return obj.Spec.SyncMode == clusterv1alpha1.Push
	},
	DeleteFunc: func(deleteEvent event.DeleteEvent) bool {
		obj := deleteEvent.Object.(*clusterv1alpha1.Cluster)
		if obj.Spec.SecretRef == nil {
			return false
		}
		return obj.Spec.SyncMode == clusterv1alpha1.Push
	},
	GenericFunc: func(event.GenericEvent) bool {
		return false
	},
}
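
For illustration only, here is a minimal sketch of the idea above (not the actual fix PR). It assumes the Cluster API enables the status subresource, so that metadata.generation changes only on spec updates; the function name newClusterPredicateIgnoringStatus and its package are hypothetical.

package predicates // hypothetical package, for illustration only

import (
	clusterv1alpha1 "github.com/karmada-io/karmada/pkg/apis/cluster/v1alpha1"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// newClusterPredicateIgnoringStatus drops update events caused only by
// status writes, so the cluster-status controller requeues the Cluster on
// its own --cluster-status-update-frequency timer rather than on every
// status change.
func newClusterPredicateIgnoringStatus(clusterName string) predicate.Funcs {
	return predicate.Funcs{
		CreateFunc: func(createEvent event.CreateEvent) bool {
			return createEvent.Object.GetName() == clusterName
		},
		UpdateFunc: func(updateEvent event.UpdateEvent) bool {
			if updateEvent.ObjectNew.GetName() != clusterName {
				return false
			}
			oldCluster, okOld := updateEvent.ObjectOld.(*clusterv1alpha1.Cluster)
			newCluster, okNew := updateEvent.ObjectNew.(*clusterv1alpha1.Cluster)
			if !okOld || !okNew {
				return false
			}
			// Never swallow deletion-related updates.
			if !newCluster.DeletionTimestamp.IsZero() {
				return true
			}
			// generation is bumped only on spec changes (assumption: the
			// status subresource is enabled), so a status-only write leaves
			// it unchanged and the event can be ignored here.
			return oldCluster.Generation != newCluster.Generation
		},
		DeleteFunc: func(deleteEvent event.DeleteEvent) bool {
			return deleteEvent.Object.GetName() == clusterName
		},
		GenericFunc: func(event.GenericEvent) bool {
			return false
		},
	}
}

Note that a generation check also filters out label/annotation-only updates; comparing the old and new spec with reflect.DeepEqual would be an alternative if those updates must still be observed.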

RainbowMango (Member) commented on Apr 10, 2025

Are you saying that for both pull mode and push mode, the cluster-status-controller updates the Cluster status continuously, regardless of the --cluster-status-update-frequency flag?

If so, it would be a serious bug that heavily affects performance.

cc @zach593 @CharlesQQ take a look

LivingCcj (Contributor, Author) commented on Apr 10, 2025

Yeah, just as you said.

LivingCcj (Contributor, Author) commented on Apr 10, 2025

This PR could fix the issue.

CharlesQQ (Member) commented on Apr 10, 2025

I added and deleted workloads many times, but did not observe the Cluster status updating continuously; the update interval matched --cluster-status-update-frequency. @LivingCcj could you please give a detailed description of the steps to reproduce your problem?

[Image: log screenshot showing the cluster status sync timeline]

RainbowMango (Member) commented on Apr 10, 2025

@CharlesQQ You've done the test I wanted to do! Thank you very much!

LivingCcj (Contributor, Author) commented on Apr 10, 2025

Thanks for your attention, @RainbowMango @CharlesQQ.
This test scenario needs workloads to be added or deleted frequently in the member cluster, e.g. with a shell script like the following, run in the member cluster:

#!/usr/bin/env bash
# Churn a workload in the member cluster to force frequent status changes.
kubectl delete -f deployment.yaml
sleep 1
kubectl apply -f deployment.yaml
for i in {1..1000}; do
    sleep 1
    random_number=$((RANDOM % 11))  # random replica count in [0,10]
    echo $random_number
    kubectl scale -f deployment.yaml --replicas=$random_number
done
kubectl delete -f deployment.yaml
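
While this churn is running, the expectation is that cluster.status is still written only once per --cluster-status-update-frequency interval; writes arriving roughly every second instead reproduce the problem.
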
RainbowMango (Member) commented on Apr 10, 2025

@LivingCcj can you share some logs? Like @CharlesQQ posted above, showing the sync timeline.

LivingCcj (Contributor, Author) commented on Apr 10, 2025

Here are some logs from a production member cluster; the requeue period is much shorter than the configured interval.
[Image: log screenshot of the requeue timeline]

CharlesQQ (Member) commented on Apr 11, 2025

@LivingCcj Can you figure out which cluster status fields are changing?

(11 remaining items in the thread not shown)

