Scaling JupyterHub beyond one replica #7

### Problem Statement

As a JupyterHub administrator with thousands of users, I want to be able to serve my users with JupyterHub without slowing down or crashing. JupyterHub's current architecture prohibits running more than one `Hub` for a given set of users, so to serve a large number of users I must split them across independent Hubs. This adds to the complexity of configuring and operating my deployment.


### Proposed Solution

Update the JupyterHub architecture to allow multiple concurrent JupyterHub instances, so that several replicas can share the load. This will increase the number of users a single JupyterHub deployment can reasonably support, and it mitigates the impact of slow operations blocking one Hub.

### Proposed Implementation

This is a substantial undertaking. The first step is to make the db session lifetime per-request (like most normal webapps!). That means removing all long-lived ORM objects (mainly `User` in the `user_dict` and `orm.Spawner`) in favor of methods and functions that take the current session as an argument. The second step, needed to actually allow multiple Hubs, is to deal with 'ownership' of running Spawners for the purposes of spawning/polling.
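To make the two steps concrete, here is a rough sketch of what they might look like. It is not JupyterHub code: it uses the standard-library `sqlite3` module as a stand-in for the Hub's SQLAlchemy session. The `spawners` table, the `owner_hub` column, and the helpers `request_session`, `add_spawner`, `claim_spawner` and `get_owner` are all hypothetical names made up for illustration. The ownership step is shown as an atomic conditional `UPDATE`, which is only one possible approach.

```python
import os
import sqlite3
import tempfile
from contextlib import contextmanager

# Stand-in for the database shared by all Hub replicas (hypothetical path).
DB_PATH = os.path.join(tempfile.mkdtemp(), "hub-sketch.sqlite")


@contextmanager
def request_session():
    """Open a session for one request: commit on success, roll back on error."""
    conn = sqlite3.connect(DB_PATH)
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()  # nothing tied to this session outlives the request


def init_db(session):
    # Hypothetical schema; the real Hub schema is different.
    session.execute(
        "CREATE TABLE IF NOT EXISTS spawners ("
        "id INTEGER PRIMARY KEY, user_name TEXT NOT NULL, owner_hub TEXT)"
    )


def add_spawner(session, user_name):
    """Functions take the current session instead of holding ORM objects."""
    cur = session.execute(
        "INSERT INTO spawners (user_name) VALUES (?)", (user_name,)
    )
    return cur.lastrowid


def claim_spawner(session, spawner_id, hub_id):
    """Claim an unowned spawner; return True only if this hub now owns it."""
    cur = session.execute(
        "UPDATE spawners SET owner_hub = ? WHERE id = ? AND owner_hub IS NULL",
        (hub_id, spawner_id),
    )
    return cur.rowcount == 1


def get_owner(session, spawner_id):
    row = session.execute(
        "SELECT owner_hub FROM spawners WHERE id = ?", (spawner_id,)
    ).fetchone()
    return row[0] if row else None


# Each "request" opens its own short-lived session.
with request_session() as session:
    init_db(session)
    spawner_id = add_spawner(session, "alice")

with request_session() as session:
    first = claim_spawner(session, spawner_id, "hub-a")   # first replica claims it

with request_session() as session:
    second = claim_spawner(session, spawner_id, "hub-b")  # second replica is refused

with request_session() as session:
    owner = get_owner(session, spawner_id)

print(first, second, owner)  # True False hub-a
```

Because only plain values such as `spawner_id` cross request boundaries, no replica holds state that another replica could make stale. The conditional `UPDATE` lets the database arbitrate which Hub polls a given Spawner.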

### How will this fit in the ecosystem?

It is likely that this will have breaking consequences for Spawners, as the ORM objects Spawners access will need to change or go away. Most basic Spawners should be unaffected, but anything accessing the underlying `orm_spawner` and/or `spawner.user` will likely need updating. It may also create a new area of development in 'spawn pools' for running outside the Hub.

### Endorsements

- @minrk https://github.com/jupyterhub/jupyterhub/issues/1932

