
Coordinator-only nodes (server-side load balancing) #107

Description

@dkropachev

Jira task: https://scylladb.atlassian.net/browse/DRIVER-677
Jira epic: https://scylladb.atlassian.net/browse/DRIVER-29

Copied from Jira epic DRIVER-29:

The original Alternator Load Balancing design document (https://docs.google.com/document/d/1twgrs6IM1B10BswMBUNqm7bwu5HCm47LOYE-Hdhuu_8/edit) also suggested server-side load balancing as a direction for unmodified clients. One of the suggested implementations was frontend (coordinator-only) nodes. The idea is to have a regular Alternator node that holds no data (it owns zero tokens) serve as a load balancer in front of the other nodes, which do hold the data.

The purpose of this issue is to start testing and documenting this approach. I want to do the following:

  1. Write a test, perhaps in scylla.git's test/topology_custom (not here), verifying that zero-token nodes actually work. We want to check that this works for both tablets-based and vnodes-based tables, because Alternator currently defaults to vnodes (but we can easily create an Alternator table with tablets for testing).
  2. Perform a very rudimentary performance test. The goal of this test isn't to measure absolute performance (so it won't need any nice beefy VMs) but only relative performance between coordinator-only and data-holding nodes: if a data-holding Alternator node can finish N requests per second, we want the coordinator to handle significantly more than N requests per second. Otherwise it won't be useful as a load balancer (or, if we take larger nodes, it will be a very expensive solution). In other words, if a coordinator is added to a K-node cluster (K=3,4,...), does the coordinator reach 100% CPU first, leaving the cluster partially unused, or can it drive enough requests to bring the cluster to 100% utilization?
  3. Document in this repository how to set up the coordinator-only node, together with the lessons learned from the two steps above. Also start thinking about what we can do for high availability (the coordinator node suddenly failing). Virtual IP address? DNS?
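
The question in step 2 can be framed as simple arithmetic. The following is only a rough sketch, not a measurement method: it assumes that a K-node cluster can absorb about K times N requests per second (ignoring replication and the coordination work that data nodes no longer do), and the function names and sample numbers are hypothetical.

```python
import math


def coordinator_is_bottleneck(coordinator_rps: float, node_rps: float, num_nodes: int) -> bool:
    """Return True if a single coordinator saturates before the cluster does.

    Assumption (not from the issue): cluster capacity is node_rps * num_nodes.
    """
    cluster_rps = node_rps * num_nodes
    return coordinator_rps < cluster_rps


def coordinators_needed(coordinator_rps: float, node_rps: float, num_nodes: int) -> int:
    """Smallest number of coordinators whose combined throughput covers the cluster."""
    cluster_rps = node_rps * num_nodes
    return math.ceil(cluster_rps / coordinator_rps)


if __name__ == "__main__":
    # Hypothetical numbers: each data node handles N=1000 req/s, K=3 nodes,
    # and one coordinator-only node handles 2000 req/s.
    print(coordinator_is_bottleneck(2000, 1000, 3))
    print(coordinators_needed(2000, 1000, 3))
```

Under this simplified model, a coordinator that handles only slightly more than N requests per second would need to be replicated almost once per data node, which is the "very expensive solution" case described in step 2.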

The relevant Scylla feature, coordinator-only nodes a.k.a. zero-token nodes or proxy nodes, was requested in scylladb/scylladb#6527 and implemented in scylladb/scylladb#19684.

CC @swasik

Migrated from GitHub issue: scylladb/alternator-load-balancing#51
