Implement Consistent Hashing with Bounded Loads #9239

kirs · 2022-10-31T21:48:51Z

This PR implements Consistent Hashing with Bounded Loads algorithm outlined in the paper. The same algorithm is employed by Haproxy and Envoy which made it easier for me to follow an existing implementation and port it into ingress-nginx. This chains nicely to chash and chashsubset balancers that are available in ingress-nginx.

What's the goal?

The goal here is to keep routing requests for the same tenant to same endpoints - as long as that doesn't overload that pod, and if it does, spill over to neighbour pods in the ring.

This is relevant specifically for multi-tenant web apps where each request is annotated with X-Account-Id or whatever and you can leverage better caching in the upstream if same container ends up serving the same account most of the time.

How does it work?

For the specified hash_balance_factor, requests to any upstream host are capped at hash_balance_factor times the average number of requests across the cluster. When a request arrives for an upstream host that is currently serving at its max capacity, linear probing is used to identify an eligible host in the ring. For example, with hash_balance_factor=2, each host will be allowed to burst 2x above the average load in the cluster. If the first pick of consistent hash in the ring is loaded above that threshold, it will move on to the next node in the ring.

How do you track the load?

Same as Envoy and Haproxy, we keep track of active requests in total and per each node. That allows us to do the math according to the strategy from above and to figure if the host is overloaded or not.

How do you know it works?

This approach has been tested at Shopify for the past month and it's been showing good data. This is a backport from our private fork that we'd like to contribute back to upstream.

@ElvinEfendi

k8s-ci-robot · 2022-10-31T21:48:58Z

@kirs: This issue is currently awaiting triage.

If Ingress contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot · 2022-10-31T21:49:00Z

Hi @kirs. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

ElvinEfendi · 2022-10-31T22:22:13Z

@rikatz @strongjz @tao12345666333 I've already reviewed this in detail in our fork. I'll let you decide if you want to merge this or when do you merge this. But I approve it.

longwuyuan · 2022-10-31T23:55:56Z

@ElvinEfendi @kirs , is it possible to see the numbers like requests received and routed etc on a kind cluster on a laptop with 16GB RAM 4-6cores, by cloning the fork/branch by @kirs

kirs · 2022-11-01T00:41:08Z

@longwuyuan we can, but I'm curious what kind of data you're expecting to see there?
Are you curious to compare it to other balancers?

In general, this one is pretty specific in terms of packing tenants to be served by same endpoints. We shouldn't expect it to be comparable with EWMA or round-robin because it optimized for different things.

longwuyuan · 2022-11-01T02:14:49Z

@kirs just wanted to put prometheus-operator and watch requests coming and routed to same pod, in a multiple replica workload, under load

longwuyuan · 2022-11-01T02:15:19Z

/ok-to-test

longwuyuan · 2022-11-01T02:17:28Z

@kirs , also just checked the commits, am not a developer, so explicitly asking if this occurs quietly for all use-cases or is there a toggle/other switch/aannotation etc. to enable/disable this useful feeature

tao12345666333 · 2022-11-01T03:24:00Z

/retest

ElvinEfendi · 2022-11-01T15:18:04Z

rootfs/etc/nginx/lua/balancer/chashboundedloads.lua

+
+local util = require("util")
+local split = require("util.split")
+local reverse_table = util.reverse_table


We need to also include reverse_table implementation in this PR.

ElvinEfendi · 2022-11-01T15:18:27Z

rootfs/etc/nginx/lua/balancer/chashboundedloads.lua

+local ngx_ERR = ngx.ERR
+local ngx_WARN = ngx.WARN
+local ngx_log = ngx.log
+local tostring = tostring


This is unused.

kirs · 2022-11-01T16:58:59Z

@longwuyuan this feature is opt-in. It is controlled like the rest of load balancers through annotations like:

    nginx.ingress.kubernetes.io/load-balancer: "chashboundedloads"
    nginx.ingress.kubernetes.io/upstream-hash-by: "$arg_predictorid" <- the variable you want to hash by
    nginx.ingress.kubernetes.io/upstream-hash-by-balance-factor: "1.5"

tao12345666333 · 2022-11-01T22:16:07Z

I did a quick review and it looks good.

I will review it in detail as soon as possible, especially the test cases

/assign

kirs · 2022-11-14T22:43:13Z

@tao12345666333 let me know if you have any further questions about tests or implementation 👂

k8s-ci-robot · 2023-02-02T17:34:57Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kirs
Once this PR has been reviewed and has the lgtm label, please ask for approval from tao12345666333 by writing /assign @tao12345666333 in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kirs · 2023-02-02T17:39:31Z

I updated the implementation with the latest version that we've been running in production. This one includes the seeding logic.

ElvinEfendi · 2023-02-03T03:30:55Z

@kirs it'd also be useful to add an e2e test case to

ingress-nginx/test/e2e/annotations/upstreamhashby.go

Line 101 in 0a0a3fe

})

. It seems straightforward enough to include that in this PR too.

tao12345666333 · 2023-02-03T09:59:41Z

rootfs/etc/nginx/lua/balancer/chashboundedloads.lua

+      self.requests_by_endpoint[endpoint] = nil
+    end
+  end
+  self.total_requests = self.total_requests - 1


Should we check if self.total_requests is greater than 0?

That should never happen according to our tests. The only point I could see is a critical panic-like log, is that what you've been thinking?
If the algorithm is wrong in some way and we've missed a bug, I'd rather let us run with negative total_requests and let something else break more loudly than if we let it quietly never decrement it.

tao12345666333 · 2023-02-03T10:09:10Z

rootfs/etc/nginx/lua/util.lua

@@ -180,4 +189,45 @@ local function replace_special_char(str, a, b)
 end
 _M.replace_special_char = replace_special_char

+local function get_hostname()
+  local f = io.popen("/bin/hostname")


Consider using the os.getenv("HOSTNAME") function to retrieve the hostname?

Is that an env variable that is commonly available in Unix? Or is that more to allow testing this easier?

ElvinEfendi · 2023-02-08T18:07:05Z

There are two legitimate test failures:

!!! 'gofmt -s' needs to be run on the following files: 
./test/e2e/annotations/upstreamhashby.go
Error: Process completed with exit code 1.

Error:   2023/02/06 13:44:34 [error] 71#71: *336 failed to run balancer_by_lua*: /etc/nginx/lua/balancer/chashboundedloads.lua:228: variable "chashbl_debug" not found for writing; maybe it is a built-in variable that is not changeable or you forgot to use "set $chashbl_debug '';" in the config file to define it first

k8s-ci-robot · 2023-07-22T12:58:09Z

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot · 2024-08-25T20:47:51Z

@kirs: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pre-ingress-nginx-codegen	`9eebd83`	link	true	`/test pre-ingress-nginx-codegen`
pre-ingress-nginx-lua-test	`9eebd83`	link	true	`/test pre-ingress-nginx-lua-test`
pre-ingress-nginx-boilerplate	`9eebd83`	link	true	`/test pre-ingress-nginx-boilerplate`

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 31, 2022

k8s-ci-robot requested review from strongjz and tao12345666333 October 31, 2022 21:48

k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Oct 31, 2022

k8s-ci-robot added needs-priority area/lua Issues or PRs related to lua code size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 31, 2022

kirs force-pushed the chashbl branch 2 times, most recently from 279c336 to 43d9b6e Compare October 31, 2022 21:52

k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 1, 2022

ElvinEfendi reviewed Nov 1, 2022

View reviewed changes

kirs force-pushed the chashbl branch 2 times, most recently from be9ed06 to ec94c71 Compare November 1, 2022 17:08

k8s-ci-robot assigned tao12345666333 Nov 1, 2022

kirs force-pushed the chashbl branch from ec94c71 to c483f83 Compare February 2, 2023 17:34

kirs force-pushed the chashbl branch 3 times, most recently from 01a538c to 56de583 Compare February 2, 2023 17:37

Implement Consistent Hashing with Bounded Loads

db5edc6

kirs force-pushed the chashbl branch from 56de583 to db5edc6 Compare February 2, 2023 17:41

tao12345666333 reviewed Feb 3, 2023

View reviewed changes

e2e test

9eebd83

kirs force-pushed the chashbl branch from bf7b4ea to 9eebd83 Compare February 6, 2023 13:25

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 22, 2023

airhorns mentioned this pull request May 1, 2024

Add plugin hook for adding new load balancing algorithms #11336

Closed

Gacko force-pushed the main branch from 3ff42a9 to 0109163 Compare March 24, 2025 17:44

Implement Consistent Hashing with Bounded Loads #9239

Are you sure you want to change the base?

Implement Consistent Hashing with Bounded Loads #9239

Conversation

kirs commented Oct 31, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

k8s-ci-robot commented Oct 31, 2022

Uh oh!

k8s-ci-robot commented Oct 31, 2022

Uh oh!

ElvinEfendi commented Oct 31, 2022

Uh oh!

longwuyuan commented Oct 31, 2022

Uh oh!

kirs commented Nov 1, 2022

Uh oh!

longwuyuan commented Nov 1, 2022

Uh oh!

longwuyuan commented Nov 1, 2022

Uh oh!

longwuyuan commented Nov 1, 2022

Uh oh!

tao12345666333 commented Nov 1, 2022

Uh oh!

ElvinEfendi Nov 1, 2022

Choose a reason for hiding this comment

Uh oh!

ElvinEfendi Nov 1, 2022

Choose a reason for hiding this comment

Uh oh!

kirs commented Nov 1, 2022

Uh oh!

tao12345666333 commented Nov 1, 2022

Uh oh!

kirs commented Nov 14, 2022

Uh oh!

k8s-ci-robot commented Feb 2, 2023

Uh oh!

kirs commented Feb 2, 2023

Uh oh!

ElvinEfendi commented Feb 3, 2023

Uh oh!

tao12345666333 Feb 3, 2023

Choose a reason for hiding this comment

Uh oh!

kirs Feb 6, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tao12345666333 Feb 3, 2023

Choose a reason for hiding this comment

Uh oh!

kirs Feb 6, 2023

Choose a reason for hiding this comment

Uh oh!

ElvinEfendi commented Feb 8, 2023

Uh oh!

k8s-ci-robot commented Jul 22, 2023

Uh oh!

k8s-ci-robot commented Aug 25, 2024

Uh oh!

Uh oh!

kirs commented Oct 31, 2022 •

edited

Loading

kirs Feb 6, 2023 •

edited

Loading