Add leader election to reconciler#672

Open
AMacedoP wants to merge 1 commit into k8snetworkplumbingwg:master from AMacedoP:feature/reconciler_leader_election

@AMacedoP

What this PR does / why we need it:
Adds leader election to the reconciler execution to ensure that it runs on only one pod of the whereabouts daemonset.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #386.

Special notes for your reviewer (optional):

Guard the reconciler execution behind a leader election lock to ensure
that it runs on one pod of the daemonset. This prevents high memory
usage on the whereabouts pod and apiserver when running on large
clusters.

Fixes k8snetworkplumbingwg#386.

Signed-off-by: Alejandro Macedo <alex.macedopereira@gmail.com>
@gemini-code-assist

Summary of Changes

Hello @AMacedoP, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust leader election mechanism for the IP reconciler component of Whereabouts. By leveraging Kubernetes' native leader election capabilities, it guarantees that the reconciler's scheduled tasks, such as IP reconciliation, are executed by a single, active leader pod within the daemonset. This enhancement significantly improves the stability and correctness of the IP management by preventing concurrent modifications and ensuring consistent state across the cluster.

Highlights

  • Leader Election Implementation: Implemented Kubernetes leader election for the IP reconciler to ensure that only one instance of the reconciler runs at any given time across the Whereabouts daemonset. This prevents race conditions and redundant operations when multiple pods are running.
  • Reconciler Logic Refactoring: The existing cron scheduler and configuration watcher logic for the IP reconciler has been refactored into a dedicated function, runScheduledReconciler, which is now invoked only by the elected leader.
  • Dynamic Leader Identity: Added logic to dynamically determine the leader's identity using environment variables like POD_NAME, hostname, or NODENAME, which is then used in the leader election process.
  • Daemonset Configuration Update: Updated the daemonset configurations (Helm chart and direct YAML) to expose the pod's name as an environment variable (POD_NAME) to facilitate leader election identity.
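
The identity lookup described in the highlights can be sketched as a simple fallback chain. This is a minimal illustration, not the PR's reconcilerLeaderIdentity: the function name leaderIdentity, the injected lookup functions, and the exact fallback order (POD_NAME, then hostname, then NODENAME) are assumptions based only on the summary above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// leaderIdentity returns the first non-empty candidate in the order
// POD_NAME, OS hostname, NODENAME. The ordering is an assumption taken
// from the summary text, not from the PR's actual code.
// The lookups are injected so the fallback can be exercised without
// touching the real process environment.
func leaderIdentity(getenv func(string) string, hostname func() (string, error)) (string, error) {
	if name := getenv("POD_NAME"); name != "" {
		return name, nil
	}
	if host, err := hostname(); err == nil && host != "" {
		return host, nil
	}
	if node := getenv("NODENAME"); node != "" {
		return node, nil
	}
	return "", errors.New("no leader identity: POD_NAME, hostname and NODENAME are all empty")
}

func main() {
	id, err := leaderIdentity(os.Getenv, os.Hostname)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("leader election identity:", id)
}
```

Preferring the pod name matters because every pod of a daemonset needs a distinct identity in the lease record; the hostname of a host-network pod may equal the node name, which is still unique per daemonset pod but less informative in logs.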

Changelog

  • cmd/controlloop/controlloop.go
    • Added context, metav1, leaderelection, and resourcelock imports for Kubernetes leader election.
    • Introduced new constants for leader election configuration, including lease name, duration, renew deadline, and retry period.
    • Refactored the main function to initiate a leader election loop in a goroutine.
    • Created runReconcilerLeaderElectionLoop to set up the Kubernetes client and context for leader election.
    • Implemented runReconcilerLeaderElection to configure and start the leader elector with callbacks for leadership events.
    • Moved the cron scheduler and configuration watcher logic into a new runScheduledReconciler function, which is called when a pod becomes the leader.
    • Added helper functions whereaboutsNamespace and reconcilerLeaderIdentity to determine the namespace and unique identity for leader election.
  • deployment/whereabouts-chart/templates/daemonset.yaml
    • Added the POD_NAME environment variable to the daemonset template, sourcing its value from metadata.name.
  • doc/crds/daemonset-install.yaml
    • Added the POD_NAME environment variable to the example daemonset manifest, sourcing its value from metadata.name.
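
The changelog mentions new constants for lease duration, renew deadline, and retry period, but their values are not shown on this page. As far as I know, client-go's leader elector rejects a configuration unless the lease duration exceeds the renew deadline and the renew deadline exceeds the retry period multiplied by a jitter factor of 1.2. The sketch below checks those two relations with the standard library only; the constant values and the name validateTimings are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical values for illustration; the PR's real constants are not
// visible in this conversation.
const (
	leaseDuration = 15 * time.Second
	renewDeadline = 10 * time.Second
	retryPeriod   = 2 * time.Second

	// Assumed to mirror client-go's leaderelection.JitterFactor.
	jitterFactor = 1.2
)

// validateTimings applies the ordering constraints that the leader
// elector is believed to enforce when it is constructed.
func validateTimings(lease, renew, retry time.Duration) error {
	if lease <= renew {
		return errors.New("lease duration must be greater than renew deadline")
	}
	if renew <= time.Duration(jitterFactor*float64(retry)) {
		return errors.New("renew deadline must be greater than retry period * jitter factor")
	}
	return nil
}

func main() {
	if err := validateTimings(leaseDuration, renewDeadline, retryPeriod); err != nil {
		fmt.Println("invalid timings:", err)
		return
	}
	fmt.Println("timings are consistent")
}
```

Checking the relations up front is cheap and makes a misconfigured constant fail at startup rather than as an opaque elector construction error.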

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces leader election for the reconciler, ensuring it runs on a single pod within the daemonset. A security review found no immediate vulnerabilities. The implementation correctly utilizes the client-go leader election mechanism and is well-organized, with one minor suggestion for code clarity and adherence to idiomatic Go practices.

Comment on lines +116 to +128
 	electionCtx, cancelElection := context.WithCancel(ctx)
 	err = runReconcilerLeaderElection(
 		electionCtx,
 		k8sClientSet,
 		namespace,
 		identity,
 		errorChan,
 		cancelElection,
 	)
 	cancelElection()
 	if err != nil {
 		errorChan <- err
 	}

medium

The call to cancelElection() on line 125 is redundant. The runReconcilerLeaderElection function is a blocking call that only returns after its context (electionCtx) has been canceled. Therefore, calling cancelElection() again is a no-op. Using defer is a more idiomatic and cleaner way to handle context cancellation, which improves code maintainability.

 	electionCtx, cancelElection := context.WithCancel(ctx)
 	defer cancelElection()
 
 	if err := runReconcilerLeaderElection(
 		electionCtx,
 		k8sClientSet,
 		namespace,
 		identity,
 		errorChan,
 		cancelElection,
 	); err != nil {
 		errorChan <- err
 	}
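
The suggestion relies on a documented property of Go's context package: after the first call, later calls to a CancelFunc do nothing, so a deferred cancel is safe even when the context was already cancelled elsewhere. The sketch below demonstrates this with a hypothetical stand-in, runElection, which is not the PR's runReconcilerLeaderElection; it also covers the case where setup fails and the function returns before anything cancels the context.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// runElection is a hypothetical stand-in for a blocking election call.
// It either fails during setup (leaving ctx untouched) or cancels the
// context itself, as a leadership-lost callback might, and then waits.
func runElection(ctx context.Context, cancel context.CancelFunc, failSetup bool) error {
	if failSetup {
		return errors.New("setup failed")
	}
	cancel()
	<-ctx.Done()
	return nil
}

// runLoop follows the suggested shape: the deferred cancel runs on every
// return path. The context is returned only so callers can inspect it.
func runLoop(failSetup bool) (context.Context, error) {
	electionCtx, cancelElection := context.WithCancel(context.Background())
	defer cancelElection() // a second call after an earlier cancel is a no-op

	err := runElection(electionCtx, cancelElection, failSetup)
	return electionCtx, err
}

func main() {
	for _, failSetup := range []bool{false, true} {
		ctx, err := runLoop(failSetup)
		fmt.Printf("failSetup=%v err=%v ctxErr=%v\n", failSetup, err, ctx.Err())
	}
}
```

In both runs the context ends up cancelled, which is the point of the defer: cleanup does not depend on which path the blocking call took to return.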

Development

Successfully merging this pull request may close these issues.

Memory limit of 200Mi for the whereabouts daemon-set inadequate for large clusters