Add state management and queuing #1

@KastanDay

Description

State management:

  • Keep track of which model(s) are in memory to support advanced batching (NOT pure FIFO)
  • Prioritization?
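One possible shape for the state tracking described above, as a minimal sketch only. `ModelRegistry` and all of its method names are hypothetical and not part of any existing code in this repo; it simply records which models are resident and how long each has been in memory, which is the information a batching scheduler would need.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class ModelRegistry:
    """Hypothetical tracker: which models are in memory, and since when."""

    # Injectable clock so the tracker can be tested without real waiting.
    clock: Callable[[], float] = time.monotonic
    _loaded_at: Dict[str, float] = field(default_factory=dict)

    def mark_loaded(self, model: str) -> None:
        # Keep the original load time if the model is already resident.
        self._loaded_at.setdefault(model, self.clock())

    def mark_unloaded(self, model: str) -> None:
        self._loaded_at.pop(model, None)

    def is_loaded(self, model: str) -> bool:
        return model in self._loaded_at

    def seconds_in_memory(self, model: str) -> Optional[float]:
        # None means the model is not currently loaded.
        loaded_at = self._loaded_at.get(model)
        return None if loaded_at is None else self.clock() - loaded_at
```

Prioritization is left out of this sketch on purpose, since the issue still lists it as an open question.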

Queuing:

  • Inference queue
  • Advanced batching: when the queue contains separate requests for the same model, batch them and run all jobs requesting that model before moving on to the next model. If other jobs are waiting in the queue, cap any one model at 15-20 minutes in memory. This should balance efficiency (batching) with fairness (FIFO queuing).
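The batching rule above could be sketched roughly as follows. This is an illustration under assumptions, not a proposed final design: `Job`, `BatchingQueue`, and `max_seconds_per_model` are hypothetical names, the 15-minute default is just the low end of the range mentioned above, and the cap is only checked between batches (a running batch is never interrupted).

```python
import time
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Optional


@dataclass
class Job:
    job_id: int
    model: str


class BatchingQueue:
    """FIFO queue that prefers the model already in memory, up to a time cap."""

    def __init__(self, max_seconds_per_model: float = 15 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_seconds_per_model = max_seconds_per_model  # assumed default
        self.clock = clock
        self.pending: Deque[Job] = deque()
        self.current_model: Optional[str] = None
        self.loaded_at: float = 0.0

    def submit(self, job: Job) -> None:
        self.pending.append(job)

    def next_batch(self) -> List[Job]:
        """Return every queued job for the model chosen to run next."""
        if not self.pending:
            return []
        model = self._choose_model()
        if model != self.current_model:
            # Switching models: restart the time-in-memory timer.
            self.current_model = model
            self.loaded_at = self.clock()
        batch = [j for j in self.pending if j.model == model]
        self.pending = deque(j for j in self.pending if j.model != model)
        return batch

    def _choose_model(self) -> str:
        current = self.current_model
        has_current = any(j.model == current for j in self.pending)
        others = [j for j in self.pending if j.model != current]
        over_cap = self.clock() - self.loaded_at >= self.max_seconds_per_model
        if current is not None and has_current and not (over_cap and others):
            return current  # keep batching on the resident model
        # Otherwise pick the oldest waiting job for a different model (FIFO).
        return others[0].model
```

With this rule, a job for a different model can be overtaken by later requests for the resident model, but only until the cap expires; after that the oldest waiting job for another model decides what gets loaded next.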
