Improve `delete_model` semantics for multi-tenancy: when doing multi-tenant LoRA training, we should avoid unnecessary `ray.shutdown()` calls even when the number of active tenants drops to zero. However, we should preserve the `ray.shutdown()` behavior for the full-parameter fine-tuning case.
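The gating described above could be sketched roughly as follows. This is an illustrative sketch only, assuming a hypothetical `TenantRegistry` helper and an injected `shutdown_backend` callback (standing in for `ray.shutdown`); none of these names are tinker's real API.

```python
import threading


class TenantRegistry:
    """Hypothetical helper: tracks active tenants and decides when to shut down Ray."""

    def __init__(self, full_param_mode: bool, shutdown_backend):
        self._lock = threading.Lock()
        self._count = 0
        self._full_param_mode = full_param_mode
        self._shutdown_backend = shutdown_backend  # e.g. ray.shutdown in practice

    def add_tenant(self) -> None:
        with self._lock:
            self._count += 1

    def delete_model(self) -> bool:
        """Drop one tenant; return True if the backend was shut down."""
        with self._lock:
            self._count = max(0, self._count - 1)
            # Multi-tenant LoRA: keep the Ray cluster warm even at zero tenants.
            # Full-parameter fine-tuning: preserve the existing shutdown behavior.
            if self._count == 0 and self._full_param_mode:
                self._shutdown_backend()
                return True
            return False
```

The key design point is that the shutdown decision depends on the training mode, not merely on the tenant count reaching zero.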
Ensure that an incoming request can never hit a stale proxy URL (the scenario raised in [tinker] Forward sample requests directly to backend vLLM (non-colocated) #1638). We still need some way of upserting new proxy URLs for the full-parameter fine-tuning case, but the update logic should be clean enough that routing to an old proxy URL is impossible.
In the multi-tenant case, if we avoid unnecessary shutdowns, we should only ever need a single proxy URL upsert.
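One way to make stale proxy URLs unreachable is to tag each upsert with a monotonically increasing generation and reject writes carrying an older one. A minimal sketch, assuming a hypothetical `ProxyUrlStore` (not tinker's actual implementation):

```python
import threading
from typing import Optional, Tuple


class ProxyUrlStore:
    """Hypothetical generation-tagged store for the current proxy URL.

    A writer that restarts the backend bumps the generation; any upsert
    carrying a generation <= the stored one is a stale writer and is ignored,
    so readers can never observe a rollback to an old URL.
    """

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._generation = 0
        self._url: Optional[str] = None

    def upsert(self, url: str, generation: int) -> bool:
        with self._lock:
            if generation <= self._generation:
                return False  # stale writer: reject silently
            self._generation = generation
            self._url = url
            return True

    def current(self) -> Tuple[Optional[str], int]:
        with self._lock:
            return self._url, self._generation
```

Under this scheme the multi-tenant path performs exactly one upsert (generation 1) for the lifetime of the cluster, while the full-parameter path bumps the generation on each restart.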
Per discussion, cc: @pcmoritz