Commit ff3a722

Add HRW.Bounded (#1)
1 parent c27ba15 commit ff3a722

3 files changed

Lines changed: 94 additions & 0 deletions

README.md

Lines changed: 6 additions & 0 deletions
@@ -6,6 +6,8 @@ The most common library in the Elixir community to use to solve that problem is

This library also comes with HRW.Skeleton which uses a clustering mechanism to go from O(n) to O(log n), with the trade-off that you need to create the struct with `HRW.Skeleton.build` and pass to each call of `HRW.Skeleton.owner`.

Additionally, there's `HRW.Bounded` for when you want to control the distribution of keys across nodes to limit skew. Consistent hashing and rendezvous hashing algorithms can easily result in uneven distribution for smaller node counts, and `HRW.Bounded` lets you control that, assuming that you have the whole key set up front.

```elixir
# HRW
HRW.owner("192.168.0.1", ["server1", "server2", "server3"])
@@ -20,6 +22,10 @@ skeleton = HRW.Skeleton.build(["server1", "server2", "server3"])

HRW.Skeleton.owner("192.168.0.2", skeleton)
#=> "server3"

# HRW.Bounded
HRW.Bounded.assignments(["a", "b", "c", "d"], ["x", "y"], epsilon: 0.0)
#=> %{"a" => "x", "b" => "x", "c" => "y", "d" => "y"}
```

## Benchmarks

lib/hrw/bounded.ex

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
defmodule HRW.Bounded do
  @moduledoc """
  A bounded-load variant of HRW. Distributes a known set of keys across nodes
  such that no node receives more than `ceil(|keys| / |nodes| × (1 + epsilon))`
  keys.

  Pure function of `(keys, nodes, opts)` — any two callers with the same
  inputs produce the same assignment, no coordination required. Use this
  when the full key set is known at compute time and you want bounded skew.
  """

  @doc """
  Returns a map of `key => node` covering every input key. Returns `%{}`
  when `keys` is empty.

  Each key is assigned to its highest-scoring node, with overflow falling
  through to the next-best when a node hits the cap of
  `ceil(|keys| / |nodes| × (1 + epsilon))`.

  ## Options

    * `:epsilon` - load slack factor. Smaller values give tighter balance but
      more movement on node churn. Defaults to `0.25`.
    * `:hash_fn` - a function `term -> integer`. Defaults to `&:erlang.phash2/1`.

  ## Examples

      iex> HRW.Bounded.assignments(["a", "b", "c", "d"], ["x", "y"], epsilon: 0.0)
      %{"a" => "x", "b" => "x", "c" => "y", "d" => "y"}
  """
  @spec assignments([term()], [term()], keyword()) :: %{term() => term()}
  def assignments(keys, nodes, opts \\ [])

  def assignments(_keys, [], _opts) do
    raise ArgumentError, "HRW.Bounded.assignments/3 requires a non-empty list of nodes"
  end

  def assignments(keys, nodes, opts) do
    hash_fn = Keyword.get(opts, :hash_fn, &:erlang.phash2/1)
    epsilon = Keyword.get(opts, :epsilon, 0.25)

    if epsilon < 0 do
      raise ArgumentError,
            "HRW.Bounded.assignments/3 requires :epsilon >= 0, got: #{inspect(epsilon)}"
    end

    keys =
      keys
      |> Enum.uniq()
      |> Enum.sort()

    nodes =
      nodes
      |> Enum.uniq()
      |> Enum.sort()

    cap = ceil(length(keys) / length(nodes) * (1 + epsilon))

    {results, _} =
      Enum.reduce(keys, {%{}, %{}}, fn k, {out, load} ->
        node =
          nodes
          |> Enum.filter(fn n -> Map.get(load, n, 0) < cap end)
          |> Enum.max_by(fn n -> hash_fn.({k, n}) end)

        {Map.put(out, k, node), Map.update(load, node, 1, &(&1 + 1))}
      end)

    results
  end
end
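
The per-key selection inside the reduce can be read in isolation. The sketch below is not part of the commit: `BoundedPickSketch`, `pick/5`, and `toy_hash` are hypothetical names introduced for illustration. The toy scoring function always prefers `"x"`, which makes the fall-through to the next-best node easy to see once `"x"` has reached the cap.

```elixir
defmodule BoundedPickSketch do
  # Hypothetical helper, not part of the commit: picks one node for `key`
  # using the same filter-then-max_by steps as the reduce in HRW.Bounded.
  def pick(key, nodes, load, cap, hash_fn) do
    nodes
    # Drop nodes that have already reached the cap.
    |> Enum.filter(fn n -> Map.get(load, n, 0) < cap end)
    # Among the remaining nodes, take the highest score for this key.
    |> Enum.max_by(fn n -> hash_fn.({key, n}) end)
  end
end

# Toy scoring function (an assumption for illustration): "x" always wins.
toy_hash = fn {_key, node} -> if node == "x", do: 2, else: 1 end

# Both nodes are under the cap, so the top-scoring node is chosen.
BoundedPickSketch.pick("a", ["x", "y"], %{}, 2, toy_hash)
#=> "x"

# "x" already holds 2 keys (the cap), so the key falls through to "y".
BoundedPickSketch.pick("c", ["x", "y"], %{"x" => 2}, 2, toy_hash)
#=> "y"
```

With the default `&:erlang.phash2/1` the scores depend on the `{key, node}` tuple, so which node wins varies per key; only the filter-then-maximize shape carries over from this toy case.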

test/hrw/bounded_test.exs

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
1+
defmodule HRW.BoundedTest do
2+
use ExUnit.Case, async: true
3+
doctest HRW.Bounded
4+
5+
test "assignments respect the cap, giving exact balance with epsilon: 0.0" do
6+
keys = ["a", "b", "c", "d"]
7+
nodes = ["x", "y"]
8+
9+
counts =
10+
keys
11+
|> HRW.Bounded.assignments(nodes, epsilon: 0.0)
12+
|> Map.values()
13+
|> Enum.frequencies()
14+
15+
assert counts == %{"x" => 2, "y" => 2}
16+
end
17+
end
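
The test above pins one exact outcome. A more general check of the cap invariant could look like the following sketch. `CapCheckSketch`, `cap/3`, and `within_cap?/2` are hypothetical names, not part of the commit. The cap formula is copied from `HRW.Bounded.assignments/3`, and the sample map is the result shown in the README and doctest.

```elixir
defmodule CapCheckSketch do
  # Hypothetical helpers, not part of the commit.

  # Same formula as the `cap` line in HRW.Bounded.assignments/3.
  def cap(key_count, node_count, epsilon) do
    ceil(key_count / node_count * (1 + epsilon))
  end

  # True when no node in `assignments` (a `key => node` map) holds more
  # than `cap` keys.
  def within_cap?(assignments, cap) do
    assignments
    |> Map.values()
    |> Enum.frequencies()
    |> Enum.all?(fn {_node, count} -> count <= cap end)
  end
end

# 4 keys over 2 nodes with no slack: ceil(2.0 * 1.0)
CapCheckSketch.cap(4, 2, 0.0)
#=> 2

# 10 keys over 3 nodes with the default slack: ceil(3.33... * 1.25)
CapCheckSketch.cap(10, 3, 0.25)
#=> 5

balanced = %{"a" => "x", "b" => "x", "c" => "y", "d" => "y"}
CapCheckSketch.within_cap?(balanced, 2)
#=> true
```

Asserting `within_cap?/2` on the output of `HRW.Bounded.assignments/3` for other key sets and `:epsilon` values would exercise the bound without hard-coding which node each key lands on.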
