Implement Consistent Hashing with Bounded Loads #9239
base: main
Conversation
@kirs: This issue is currently awaiting triage. If Ingress contributors determine this is a relevant issue, they will accept it by applying the appropriate triage label.
Hi @kirs. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label.
Force-pushed from 279c336 to 43d9b6e.
@rikatz @strongjz @tao12345666333 I've already reviewed this in detail in our fork. I'll let you decide whether and when you want to merge this, but I approve it.
@ElvinEfendi @kirs, is it possible to see numbers like requests received and routed, etc. on a kind cluster on a laptop with 16GB RAM and 4-6 cores, by cloning the fork/branch by @kirs?
@longwuyuan we can, but I'm curious what kind of data you're expecting to see there? In general, this one is pretty specific in terms of packing tenants to be served by the same endpoints. We shouldn't expect it to be comparable with EWMA or round-robin because it's optimized for different things.
@kirs I just wanted to deploy prometheus-operator and watch requests coming in and being routed to the same pod, in a multi-replica workload, under load.
/ok-to-test
@kirs, also just checked the commits. I am not a developer, so explicitly asking: does this occur quietly for all use-cases, or is there a toggle/switch/annotation etc. to enable/disable this useful feature?
/retest
local util = require("util")
local split = require("util.split")
local reverse_table = util.reverse_table
We need to also include the reverse_table implementation in this PR.
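For context, a minimal sketch of what such a helper could look like, assuming it simply returns a new array-like table with the elements in reverse order (the actual util.reverse_table in the PR may differ):

```lua
-- Hypothetical sketch of a reverse_table helper; the PR's real
-- implementation in util.lua may differ.
local function reverse_table(t)
  local reversed = {}
  local size = #t
  for i = 1, size do
    -- element 1 goes to the last slot, element 2 to the second-to-last, ...
    reversed[size - i + 1] = t[i]
  end
  return reversed
end

-- reverse_table({ "a", "b", "c" })  --> { "c", "b", "a" }
```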
local ngx_ERR = ngx.ERR
local ngx_WARN = ngx.WARN
local ngx_log = ngx.log
local tostring = tostring
This is unused.
@longwuyuan this feature is opt-in. It is controlled, like the rest of the load balancers, through annotations.
Force-pushed from be9ed06 to ec94c71.
I did a quick review and it looks good. I will review it in detail as soon as possible, especially the test cases.
/assign
@tao12345666333 let me know if you have any further questions about tests or implementation 👂
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: kirs. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment.
Force-pushed from 01a538c to 56de583.
I updated the implementation with the latest version that we've been running in production. This one includes the seeding logic.
@kirs it'd also be useful to add an e2e test case for this.
self.requests_by_endpoint[endpoint] = nil
end
end
self.total_requests = self.total_requests - 1
Should we check if self.total_requests is greater than 0?
That should never happen according to our tests. The only option I could see is a critical, panic-like log; is that what you've been thinking? If the algorithm is wrong in some way and we've missed a bug, I'd rather run with a negative total_requests and let something else break more loudly than quietly never decrement it.
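Purely for illustration, the guard being discussed might look like the sketch below (assuming the OpenResty ngx API already used elsewhere in this module). This is a hypothetical variant, not what the PR does; as the reply above explains, the counter is deliberately allowed to go negative so that a bookkeeping bug surfaces loudly.

```lua
-- Hypothetical guarded decrement; field names mirror the snippet in this
-- thread, but the PR itself does not add this check.
local function release_request(self, endpoint)
  local count = self.requests_by_endpoint[endpoint]
  if count then
    if count <= 1 then
      self.requests_by_endpoint[endpoint] = nil
    else
      self.requests_by_endpoint[endpoint] = count - 1
    end
  end

  if self.total_requests > 0 then
    self.total_requests = self.total_requests - 1
  else
    -- the "panic-like" alternative: log loudly instead of going negative
    ngx.log(ngx.ERR, "total_requests would drop below zero; accounting bug?")
  end
end
```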
@@ -180,4 +189,45 @@ local function replace_special_char(str, a, b)
end
_M.replace_special_char = replace_special_char

local function get_hostname()
  local f = io.popen("/bin/hostname")
Consider using os.getenv("HOSTNAME") to retrieve the hostname?
Is that an env variable that is commonly available on Unix? Or is that more to make testing this easier?
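As a sketch of how the two suggestions could be combined (HOSTNAME is typically set by login shells and for Kubernetes pod containers, but it is not guaranteed for every process, so a fallback preserves the current behaviour):

```lua
-- Hypothetical combination of both approaches: prefer the HOSTNAME
-- environment variable (cheap, no subprocess) and fall back to
-- /bin/hostname when it is not set.
local function get_hostname()
  local hostname = os.getenv("HOSTNAME")
  if hostname and hostname ~= "" then
    return hostname
  end

  local f = io.popen("/bin/hostname")
  if not f then
    return nil
  end
  hostname = f:read("*l")  -- first line of output, newline stripped
  f:close()
  return hostname
end
```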
There are two legitimate test failures:
PR needs rebase.
@kirs: The following tests failed; say /retest to rerun all failed tests. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
This PR implements the Consistent Hashing with Bounded Loads algorithm outlined in the paper. The same algorithm is employed by Haproxy and Envoy, which made it easier for me to follow an existing implementation and port it into ingress-nginx. This chains nicely to the chash and chashsubset balancers that are available in ingress-nginx.

What's the goal?
The goal here is to keep routing requests for the same tenant to the same endpoints, as long as that doesn't overload the pod; if it does, spill over to neighbour pods in the ring. This is relevant specifically for multi-tenant web apps where each request is annotated with X-Account-Id or similar, and you can leverage better caching in the upstream if the same container ends up serving the same account most of the time.

How does it work?
For the specified hash_balance_factor, requests to any upstream host are capped at hash_balance_factor times the average number of requests across the cluster. When a request arrives for an upstream host that is currently serving at its max capacity, linear probing is used to identify an eligible host in the ring. For example, with hash_balance_factor=2, each host will be allowed to burst 2x above the average load in the cluster. If the first pick of the consistent hash in the ring is loaded above that threshold, it will move on to the next node in the ring.
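To make the selection rule concrete, here is a rough, self-contained sketch with assumed names (ring, first_index, load, balance_factor); the PR's actual balancer works on the ingress-nginx ring structures rather than this simplified form:

```lua
-- Minimal sketch of consistent hashing with bounded loads.
-- `ring` is an array of endpoints ordered by their position on the hash
-- ring, `first_index` is where plain consistent hashing would land for
-- this request, and `load` maps endpoint -> in-flight requests.
local function pick_endpoint(ring, first_index, load, total_requests, balance_factor)
  -- cap = balance_factor * average load, rounded up so a nearly idle
  -- cluster still admits requests
  local average = (total_requests + 1) / #ring
  local cap = math.ceil(average * balance_factor)

  -- linear probing: walk the ring starting from the hashed position
  -- until we find an endpoint below the cap
  for i = 0, #ring - 1 do
    local index = ((first_index - 1 + i) % #ring) + 1
    local endpoint = ring[index]
    if (load[endpoint] or 0) < cap then
      return endpoint
    end
  end

  -- with balance_factor > 1 at least one endpoint is always below the
  -- cap, so this is only a defensive fallback
  return ring[first_index]
end
```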
How do you track the load?

Same as Envoy and Haproxy, we keep track of active requests in total and per node. That allows us to do the math according to the strategy above and figure out whether a host is overloaded or not.
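A minimal sketch of that bookkeeping, with hypothetical names, showing how the two counters feed the cap check from the previous section:

```lua
-- Sketch of the load-tracking state behind the bounded-loads check.
local tracker = {
  requests_by_endpoint = {},  -- endpoint -> in-flight requests
  total_requests = 0,         -- in-flight requests across all endpoints
}

-- called when a request is routed to `endpoint`; the matching decrement
-- happens once the response has been delivered
local function track_request(endpoint)
  tracker.requests_by_endpoint[endpoint] =
    (tracker.requests_by_endpoint[endpoint] or 0) + 1
  tracker.total_requests = tracker.total_requests + 1
end

-- "is this endpoint above its allowed share?" per the bounded-loads rule
local function is_overloaded(endpoint, endpoints_count, balance_factor)
  local cap = math.ceil(
    (tracker.total_requests + 1) / endpoints_count * balance_factor)
  return (tracker.requests_by_endpoint[endpoint] or 0) >= cap
end
```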
How do you know it works?
This approach has been tested at Shopify for the past month and it's been showing good data. This is a backport from our private fork that we'd like to contribute back to upstream.
@ElvinEfendi