
Lazily load models and on insufficient resources for a load, look to unload idle models  #1403

Open
@kimjuny


Feature Request

Describe the problem the feature is intended to solve

I have thousands of models to serve, but only a few of them are requested frequently; most are requested rarely. Loading all of them into memory at the same time consumes a lot of resources and takes a long time.

Describe the solution

Is there an option to load models lazily and keep only the most frequently requested models cached in memory?
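
To make the request concrete, here is a minimal sketch of the behaviour I have in mind, written as a standalone Python class rather than against any real serving API. The names `LazyModelCache`, `loader`, and `capacity` are hypothetical, and `capacity` is a simple model count standing in for a real memory budget. It uses least-recently-used eviction as an approximation of "most frequently requested", and it is not thread-safe.

```python
from collections import OrderedDict
from typing import Any, Callable, List


class LazyModelCache:
    """Hypothetical sketch: load a model on first request, evict idle ones when full."""

    def __init__(self, loader: Callable[[str], Any], capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._loader = loader        # assumed callable that loads a model by name
        self._capacity = capacity    # assumed limit on models held in memory
        self._models: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, name: str) -> Any:
        # Cache hit: mark the model as most recently used and return it.
        if name in self._models:
            self._models.move_to_end(name)
            return self._models[name]

        # Cache miss: unload the least recently used models first, so the
        # new model is never loaded while the cache is already at capacity.
        while len(self._models) >= self._capacity:
            self._models.popitem(last=False)

        model = self._loader(name)
        self._models[name] = model
        return model

    def loaded(self) -> List[str]:
        # Model names ordered from least to most recently used.
        return list(self._models)
```

With this shape, nothing is loaded at startup, a model is loaded only when a request for it first arrives, and the model that has been idle the longest is unloaded when room is needed. A real implementation would also need locking and a resource-based limit instead of a count.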

Describe alternatives you've considered

None yet.

Additional context

I'm not sure whether this is a feature request or whether this feature already exists.
