State management:
- Keep track of which model(s) are currently in memory to support advanced batching (not pure FIFO)
- Prioritization? (open question)
Queuing:
- Inference queue
- Advanced batching: when the queue contains separate requests for the same model, batch them and run all jobs requesting that model before moving on to the next model. Cap any one model's time in memory at 15-20 minutes if other jobs are waiting in the queue. This balances efficiency (batching) with fairness (FIFO queuing).
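The scheduling policy above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the class name, the residency cap, and the `time.monotonic()`-based clock are all assumptions for the sake of the example.

```python
import time
from collections import deque

MAX_MODEL_RESIDENCY_S = 15 * 60  # hypothetical cap: 15 minutes per model


class InferenceQueue:
    """FIFO queue with model-aware batching (sketch).

    Jobs for the currently loaded model are served first, but the model is
    evicted after a residency cap when other models' jobs are waiting, so
    batching efficiency is balanced against FIFO fairness.
    """

    def __init__(self, max_residency_s=MAX_MODEL_RESIDENCY_S):
        self.jobs = deque()          # (model_name, job) in arrival order
        self.max_residency_s = max_residency_s
        self.loaded_model = None     # model currently in memory
        self.loaded_since = None     # when it was loaded

    def submit(self, model_name, job):
        self.jobs.append((model_name, job))

    def _residency_expired(self, now):
        # Only evict when a different model is actually waiting in the queue.
        others_waiting = any(m != self.loaded_model for m, _ in self.jobs)
        return (others_waiting
                and self.loaded_since is not None
                and now - self.loaded_since > self.max_residency_s)

    def next_job(self, now=None):
        """Return the next (model, job) pair, preferring the loaded model."""
        if not self.jobs:
            return None
        now = time.monotonic() if now is None else now
        # Prefer a job for the already-loaded model, unless its slice expired.
        if self.loaded_model is not None and not self._residency_expired(now):
            for i, (model, job) in enumerate(self.jobs):
                if model == self.loaded_model:
                    del self.jobs[i]
                    return model, job
        # Otherwise fall back to strict FIFO: take the oldest job and
        # "load" its model, resetting the residency clock.
        model, job = self.jobs.popleft()
        self.loaded_model = model
        self.loaded_since = now
        return model, job
```

For example, with jobs `("a", 1)`, `("b", 2)`, `("a", 3)` submitted in that order, the scheduler runs both `a` jobs before switching to `b`; once `b` has held memory past the residency cap and an `a` job is waiting, the scheduler falls back to FIFO and reloads `a`.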