Open
Labels
bug, community-backlog, docs, question, serve, stability, triage
Description
What happened + What you expected to happen
The multiplexed models example at `doc/source/serve/tutorials/model_multiplexing_forecast/content/README.md` mentions a `_prewarm` method as a way to pre-load multiplexed models. I tried to implement it in a simple example, but it seems to have no effect.
Is there a supported way to pre-load models in a multiplexed setting? Also, how would it work with multiple replicas? Would it be possible to pre-load different models on different replicas?
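To make the expected behavior concrete, here is a minimal plain-Python sketch. It is not Ray Serve code and makes no claim about how Ray Serve implements multiplexing: `ReplicaModelCache`, `get_model`, `prewarm`, and the `customer_*` model IDs are hypothetical names used only for illustration, and the "replicas" are ordinary objects. It assumes each replica keeps its own bounded LRU cache of models and shows that, after pre-warming, the first real request for a pre-loaded model should not trigger another load.

```python
from __future__ import annotations

import asyncio
from collections import OrderedDict


class ReplicaModelCache:
    """Hypothetical stand-in for one replica's bounded LRU model cache."""

    def __init__(self, max_models: int) -> None:
        self.max_models = max_models
        self.models: OrderedDict[str, str] = OrderedDict()
        self.load_count = 0  # number of (expensive) model loads performed

    async def _load(self, model_id: str) -> str:
        # Placeholder for a slow load from disk or remote storage.
        self.load_count += 1
        await asyncio.sleep(0)
        return f"model:{model_id}"

    async def get_model(self, model_id: str) -> str:
        # Cache hit: mark as most recently used and return without loading.
        if model_id in self.models:
            self.models.move_to_end(model_id)
            return self.models[model_id]
        # Cache miss: load, store, and evict the least recently used entry.
        model = await self._load(model_id)
        self.models[model_id] = model
        if len(self.models) > self.max_models:
            self.models.popitem(last=False)
        return model

    async def prewarm(self, model_ids: list[str]) -> None:
        # Pre-loading is just requesting each model before traffic arrives.
        for model_id in model_ids:
            await self.get_model(model_id)


async def demo() -> tuple[int, int]:
    # Two independent "replicas", each pre-warmed with different models.
    replica_a = ReplicaModelCache(max_models=2)
    replica_b = ReplicaModelCache(max_models=2)
    await replica_a.prewarm(["customer_1", "customer_2"])
    await replica_b.prewarm(["customer_3"])
    loads_before = replica_a.load_count + replica_b.load_count

    # First "real" requests: both should be served from the warm caches.
    await replica_a.get_model("customer_1")
    await replica_b.get_model("customer_3")
    loads_after = replica_a.load_count + replica_b.load_count
    return loads_before, loads_after


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the load count is the same before and after the first requests, which is the effect expected from pre-warming. Whether and how requests can be routed so that each replica receives the models it pre-loaded is exactly the open question here.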
Thanks a lot!
Versions / Dependencies
ray[serve] 2.54.0
Ubuntu 24.04 LTS
Reproduction script
Please find a full reproducible example attached. Run `make run-docker-all` from the root of the example.
Issue Severity
Medium: It is a significant difficulty but I can work around it.