
Unify ray serve #1931

Open

lbluque wants to merge 37 commits into main from unify-ray-serve

Conversation

lbluque (Contributor) commented Mar 26, 2026

Summary

  • Unified batch serving around BatchPredictServer + BatchServerPredictUnit, removing the parallel FAIRChemInferenceServer / FAIRChemInferenceClient / RayServeMLIPUnit architecture (~1,100 lines deleted across 3 files)
  • Added MultiplexedBatchPredictServer — a subclass of BatchPredictServer that uses @serve.multiplexed for on-demand model loading with LRU eviction, preserving the multi-model capability in a cleaner form
  • Added BatchServerPredictUnit.from_deployment_connection_info() classmethod to connect to already-running Ray Serve deployments by name, with optional multiplexed_model_id for multiplexed deployments — replacing the need for RayServeMLIPUnit
  • Moved wait_for_serve_ready() and get_ray_connection_info() into batch_predict_server.py as shared utilities
  • Extracted _init_ray_and_serve() and _build_deployment_options() helpers to share setup logic between setup_batch_predict_server() and the new setup_multiplexed_batch_predict_server()
  • Updated get_slurm_ray_cluster and get_local_ray_cluster to accept a predict_unit parameter and use setup_batch_predict_server() instead of the deleted start_serve()

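The on-demand loading with LRU eviction that MultiplexedBatchPredictServer gets from `@serve.multiplexed` can be sketched without Ray at all. The class and names below are a toy illustration of the eviction policy, not FAIRChem or Ray Serve code:

```python
from collections import OrderedDict


class LRUModelCache:
    """Toy sketch: load models on demand and evict the least-recently-used
    one when capacity is exceeded -- the policy @serve.multiplexed applies
    per replica. Names here are illustrative, not the fairchem API."""

    def __init__(self, loader, max_models=3):
        self._loader = loader          # callable: model_id -> model object
        self._max_models = max_models
        self._models = OrderedDict()   # model_id -> model, ordered by recency

    def get(self, model_id):
        if model_id in self._models:
            self._models.move_to_end(model_id)  # mark as most recently used
            return self._models[model_id]
        if len(self._models) >= self._max_models:
            self._models.popitem(last=False)    # evict least recently used
        model = self._loader(model_id)          # "load" the requested model
        self._models[model_id] = model
        return model
```

In the real deployment, the loader would be the (async) model-loading function decorated with `@serve.multiplexed`, and Ray routes each request's `multiplexed_model_id` to a replica that already holds that model when possible.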
What's removed

  • FAIRChemInferenceServer / FAIRChemInferenceClient / RayServeMLIPUnit: 3 files deleted entirely; their functionality is replaced by MultiplexedBatchPredictServer + BatchServerPredictUnit.from_deployment_connection_info()
  • Metadata serialization layer (RayServeTask, _cache_model_metadata, fetch_model_metadata): replaced by get_predict_unit_attribute which returns real objects via Ray serialization
  • SimpleNamespace inference_settings stub: replaced by fetching real InferenceSettings from the server
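The shift from a hand-rolled metadata layer to `get_predict_unit_attribute` can be sketched as follows. `DemoPredictUnit` and `ServeDeployment` are hypothetical stand-ins, and the `InferenceSettings` fields are illustrative only; a `pickle` round-trip stands in for Ray's RPC serialization:

```python
import pickle
from dataclasses import dataclass


@dataclass
class InferenceSettings:
    """Stand-in for fairchem's InferenceSettings (fields illustrative)."""
    tf32: bool = True
    merge_mole: bool = False


class DemoPredictUnit:
    """Hypothetical predict unit holding real settings as an attribute."""
    def __init__(self):
        self.inference_settings = InferenceSettings()


class ServeDeployment:
    """Sketch of the server-side endpoint: return the attribute itself and
    let serialization carry the real object to the client, instead of
    caching and shipping a hand-built metadata dict."""
    def __init__(self, predict_unit):
        self._predict_unit = predict_unit

    def get_predict_unit_attribute(self, name: str):
        return getattr(self._predict_unit, name)


# The client receives a real InferenceSettings, not a SimpleNamespace stub:
server = ServeDeployment(DemoPredictUnit())
payload = pickle.dumps(server.get_predict_unit_attribute("inference_settings"))
settings = pickle.loads(payload)
assert isinstance(settings, InferenceSettings)
```

The design point is that once the transport can serialize arbitrary objects, a per-attribute serialization scheme (the deleted RayServeTask / _cache_model_metadata / fetch_model_metadata) is redundant.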

Test plan

  • pytest tests/core/calculate/test_batcher.py -c packages/fairchem-core/pyproject.toml -vv
  • pytest tests/core/units/mlip_unit/test_predict.py -c packages/fairchem-core/pyproject.toml -vv
  • pytest tests/core/units/mlip_unit/test_inference_serve.py -c packages/fairchem-core/pyproject.toml -vv (requires GPU — covers both single-model and multiplexed server tests)

meta-cla bot added the cla signed label Mar 26, 2026
lbluque requested a review from zulissimeta March 26, 2026 23:33
lbluque marked this pull request as draft March 26, 2026 23:38
lbluque added the minor (Minor version release) and enhancement (New feature or request) labels Mar 30, 2026
lbluque requested a review from rayg1234 March 30, 2026 22:44
lbluque marked this pull request as ready for review April 9, 2026 22:11
lbluque changed the base branch from rayserve_calculator to main April 18, 2026 01:30
