```
root@psap-h200-au-syd-3:/workspace/auto-tuning-vllm# auto-tune-vllm optimize --config examples/kimi-k2.yaml --python-executable python3 --max-concurrent 1
Starting auto-tune-vllm optimization
Configuration: examples/kimi-k2.yaml
Backend: ray
Python executable: python3
/usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
  import pynvml  # type: ignore[import]
Generated study name: kimi-k2-code_13948 from prefix: kimi-k2-code
Loaded study: kimi-k2-code_13948
Saved study config to: optuna_studies/kimi-k2-code_13948/study_config.yaml
2025-10-09 05:45:48,623 INFO worker.py:1771 -- Connecting to existing Ray cluster at address: 10.245.128.4:6379...
^C[2025-10-09 05:45:53,633 W 141339 141339] gcs_rpc_client.h:155: Failed to connect to GCS at address 10.245.128.4:6379 within 5 seconds.
[2025-10-09 05:46:23,635 W 141339 141339] gcs_client.cc:184: Failed to get cluster ID from GCS server: TimedOut: Timed out while waiting for GCS to become available.
```
A more graceful check of Ray cluster availability is needed. As the log shows, the CLI does not respond to user input (the `^C` above was ignored) until the full chain of GCS connection timeouts runs its course.
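One possible shape for such a pre-flight check, as a minimal sketch rather than a proposed patch: probe the GCS TCP port with a short stdlib timeout before handing control to `ray.init()`, so an unreachable head node fails fast and Ctrl-C stays responsive. The function name `gcs_reachable` and the commented call site are hypothetical, not part of auto-tune-vllm:

```python
import socket

def gcs_reachable(address: str, timeout: float = 2.0) -> bool:
    """Cheap pre-flight check: can we open a TCP connection to the GCS port?

    `address` is the Ray head address, e.g. "10.245.128.4:6379".
    This only proves something is listening, not that the GCS is
    healthy, but it lets the CLI fail fast instead of blocking
    through Ray's internal retry/timeout chain.
    """
    host, _, port = address.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical call site in the CLI, before ray.init(address=ray_address):
# if not gcs_reachable(ray_address):
#     raise SystemExit(f"Ray cluster at {ray_address} is unreachable; "
#                      "start a head node or check the configured address.")
```

Because the probe runs before Ray installs its own connection machinery, a keyboard interrupt during the (short) socket timeout is handled by Python immediately instead of being swallowed until the GCS retries finish.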