Shard-aware load balancing #129

@nyh

Description

After we implement #11, which routes every single-partition request to a good node (one of the three replicas holding a copy of the partition, possibly one on the right rack, or the same replica every time to reduce LWT contention), the next step, requested by this issue, is to route the request to a good shard on that node.

We have such a feature in CQL - enable_shard_aware_drivers. It creates a shard-aware port: Client connections to this port are routed to a specific shard according to a formula based on the client-side port numbers. Clients open many sockets to the same node instead of just one socket to a node, and each of these sockets ends up connected to a different shard. The client calculates which shard it wants to send the request to, and uses the appropriate socket to reach that shard directly. The shard is selected such that it owns the tablet with the requested token (we can also do this with vnodes, but we probably don't care about vnode support any more).
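To make the client-side port trick concrete, here is a rough sketch. It assumes the routing formula is simply "source port modulo number of shards", and that the client already knows the node's shard count. The issue text only says "a formula based on the client-side port numbers", so treat the modulo rule, the names `shard_of_port`, `ports_for_shard` and `connect_to_shard`, and the 49152-65535 port range as illustrative assumptions rather than the actual protocol.

```python
import socket


def shard_of_port(local_port: int, nr_shards: int) -> int:
    # ASSUMPTION: the server picks the shard as (client source port mod shard count).
    return local_port % nr_shards


def ports_for_shard(shard: int, nr_shards: int, lo: int = 49152, hi: int = 65535) -> range:
    # All local ports in [lo, hi] that the assumed rule maps to `shard`.
    first = lo + (shard - lo) % nr_shards
    return range(first, hi + 1, nr_shards)


def connect_to_shard(host: str, shard_aware_port: int, shard: int, nr_shards: int) -> socket.socket:
    # Bind to a local port that maps to the wanted shard, then connect.
    for port in ports_for_shard(shard, nr_shards):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind(("", port))
        except OSError:
            sock.close()  # local port already in use, try the next candidate
            continue
        try:
            sock.connect((host, shard_aware_port))
        except OSError:
            sock.close()
            raise
        return sock
    raise OSError("no free local port maps to the requested shard")
```

The point of the sketch is that the client, not the server, decides which shard a connection lands on, just by choosing which local port to bind before connecting.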

The connection-per-shard approach suggested here will probably require some hacking on the client SDKs. The HTTP library used by these SDKs will normally open only one socket between the client and some remote node X, or more if more concurrent requests come in, but we want it to deliberately open many connections and pick one of them according to a formula based on the client-side port, and then keep all of these connections open for the next request.
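A minimal sketch of what "keep all of these connections open for the next request" could look like on the client side. Everything here is hypothetical: `ShardConnectionPool` is not part of any SDK, and `open_connection` stands in for whatever the HTTP library would need to expose to open a socket that lands on a given shard.

```python
from typing import Any, Callable, Dict


class ShardConnectionPool:
    """Keeps at most one open connection per shard of a single node."""

    def __init__(self, nr_shards: int, open_connection: Callable[[int], Any]):
        self._nr_shards = nr_shards
        # Hypothetical hook: callable(shard) -> connection that reaches that shard.
        self._open = open_connection
        self._conns: Dict[int, Any] = {}

    def get(self, shard: int) -> Any:
        if not 0 <= shard < self._nr_shards:
            raise ValueError(f"shard {shard} out of range")
        conn = self._conns.get(shard)
        if conn is None:
            # Open lazily on first use, then keep it for later requests.
            conn = self._open(shard)
            self._conns[shard] = conn
        return conn
```

For each request the client would compute the target shard and call `pool.get(shard)`, so the second request to the same shard reuses the socket instead of opening a new one. A real pool would also need to handle broken connections and more than one concurrent request per shard, which this sketch ignores.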
