[serve][llm] Feature: Add W&B Model Loading Callback for LLMEngine #58928
+147
−0
Description
This PR introduces a WandBModelLoadingCallback to enable users to load models directly from Weights & Biases (W&B) Artifacts within Ray LLM/Serve deployments.
This feature allows users to specify a model source using the wandb:// scheme, resolving it to a local disk path before the engine configuration is finalized.
Motivation
Currently, Ray LLM primarily supports model sources from Hugging Face or direct local/S3 paths. Integrating W&B Artifacts is a common requirement for teams using W&B for model versioning and tracking. This change leverages the new callback feature to provide this integration point without modifying core LLM server logic.
Implementation Details
- WandBArtifactHandler: A helper class is introduced to manage W&B API interaction. It handles downloading artifacts (run.use_artifact(...).download()) and supports optional custom configuration via wandb_base_url and wandb_api_key for private/custom W&B instances.
- Symlinking: The handler creates a symlink from the W&B client's local cache location to the user-specified local_path. This is efficient because it avoids redundant copying when the artifact is already cached.
- Configuration Model: WandBDownloaderConfig uses Pydantic validation to enforce that the required paths parameter is provided in callback_kwargs. paths is expected to be a list of (wandb_uri, local_path) tuples.
- Callback Logic (on_before_node_init): The callback initializes with its configuration validated by WandBDownloaderConfig. It iterates over the provided paths list and, for each entry, uses the WandBArtifactHandler to download the artifact and ensure it is available at the specified local_path (see the sketch after this list).
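For concreteness, here is a minimal sketch of how these pieces could fit together. It is not a copy of the PR's code: the callback class name, the on_before_node_init hook signature, and the way callback_kwargs are passed to the constructor are assumptions based on the description above.

```python
import os
from typing import List, Optional, Tuple

import wandb
from pydantic import BaseModel


class WandBDownloaderConfig(BaseModel):
    """Validated view of the callback_kwargs passed to the downloader."""

    # Required: each entry maps a wandb:// URI to the local path where the
    # artifact should be made available.
    paths: List[Tuple[str, str]]
    # Optional overrides for private / self-hosted W&B instances.
    wandb_base_url: Optional[str] = None
    wandb_api_key: Optional[str] = None


class WandBArtifactHandler:
    """Downloads a W&B artifact and symlinks its cache location to a target path."""

    def __init__(self, base_url: Optional[str] = None, api_key: Optional[str] = None):
        # Custom W&B instances are configured through the standard env vars.
        if base_url:
            os.environ["WANDB_BASE_URL"] = base_url
        if api_key:
            os.environ["WANDB_API_KEY"] = api_key

    def pull(self, wandb_uri: str, local_path: str) -> str:
        artifact_name = wandb_uri.removeprefix("wandb://")
        with wandb.init(job_type="download-model") as run:
            # download() places the artifact in the W&B client's local cache
            # and returns that directory.
            cache_dir = run.use_artifact(artifact_name).download()
        # Symlink instead of copying so an already-cached artifact is reused.
        if not os.path.lexists(local_path):
            os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
            os.symlink(cache_dir, local_path)
        return local_path


class WandBDownloader:
    """Resolves wandb:// model sources before the engine config is finalized."""

    def __init__(self, **callback_kwargs):
        self.config = WandBDownloaderConfig(**callback_kwargs)
        self.handler = WandBArtifactHandler(
            base_url=self.config.wandb_base_url,
            api_key=self.config.wandb_api_key,
        )

    def on_before_node_init(self) -> None:
        # Make every requested artifact available at its target local path.
        for wandb_uri, local_path in self.config.paths:
            self.handler.pull(wandb_uri, local_path)
```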
Related issues
This feature was discussed in the Ray LLM Slack channel, where the callback approach was identified as the preferred path forward for integrating custom model sources like W&B Artifacts.
Additional information
How to Use
To use a W&B Artifact as the model source, a user must configure the LLMConfig to point the model source to the local path where the callback will download the artifact, and then pass the WandBDownloader and its configuration via callback_config.
Example:
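The end-to-end configuration below is a hypothetical sketch. The model_loading_config fields follow Ray Serve LLM's existing API, but the shape of callback_config (a callback class path plus callback_kwargs) and the module path of WandBDownloader are assumptions based on this description rather than the final API in the diff.

```python
from ray import serve
from ray.serve.llm import LLMConfig, build_openai_app

llm_config = LLMConfig(
    model_loading_config={
        "model_id": "my-finetuned-llm",
        # The engine reads the model from the path the callback populates.
        "model_source": "/mnt/models/my-finetuned-llm",
    },
    # Hypothetical shape: the callback class and the kwargs validated by
    # WandBDownloaderConfig, as described in the implementation details.
    callback_config={
        "callback_class": "my_project.callbacks.WandBDownloader",
        "callback_kwargs": {
            "paths": [
                ("wandb://my-entity/my-project/my-model:v3",
                 "/mnt/models/my-finetuned-llm"),
            ],
            # Optional, only needed for private / self-hosted W&B instances.
            "wandb_base_url": "https://wandb.example.com",
        },
    },
)

app = build_openai_app({"llm_configs": [llm_config]})
serve.run(app)
```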
API Changes:
Testing: