Llm large reference #1951

Merged
merged 14 commits into master on Dec 3, 2024
Conversation

pgmpablo157321
Contributor

No description provided.

pgmpablo157321 requested a review from a team as a code owner on December 2, 2024 at 15:46
Contributor

github-actions bot commented Dec 2, 2024

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

vllm==0.6.3
pybind11==2.10.4
--extra-index-url https://download.pytorch.org/whl/nightly/cpu
torch==2.2.0.dev20231006+cpu
Contributor

This torch version doesn't exist
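
One way to check which torch builds the nightly CPU index actually serves before pinning a version (not part of this PR, and `pip index` is still an experimental pip command, so treat this as a sketch):

```
# Not from the PR: list the torch versions published on the nightly CPU index,
# so the requirements file can pin a dev build that actually exists.
pip index versions torch --index-url https://download.pytorch.org/whl/nightly/cpu
```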

@nvzhihanj
Contributor

WARNING 11-28 01:43:11 multiproc_gpu_executor.py:53] Reducing Torch parallelism from 112 threads to 1 to avoid unnecessary CPU contention. Set OMP_NUM_THREADS in the external environment to tune this value as needed.
INFO 11-28 01:43:11 custom_cache_manager.py:17] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager
(VllmWorkerProcess pid=899) INFO 11-28 01:43:11 multiproc_worker_utils.py:216] Worker ready; awaiting tasks
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231] Exception in worker VllmWorkerProcess while processing method init_device: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method, Traceback (most recent call last):
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]   File "/home/zhihanj/.local/lib/python3.10/site-packages/vllm/executor/multiproc_worker_utils.py", line 224, in _run_worker_process
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]     output = executor(*args, **kwargs)
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]   File "/home/zhihanj/.local/lib/python3.10/site-packages/vllm/worker/worker.py", line 166, in init_device
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]     torch.cuda.set_device(self.device)
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]   File "/home/zhihanj/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 420, in set_device
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]     torch._C._cuda_setDevice(device)
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]   File "/home/zhihanj/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 300, in _lazy_init
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]     raise RuntimeError(
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231] RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231] 
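
The traceback shows vLLM's worker processes being forked after CUDA was already initialized in the parent process. A common workaround with vLLM 0.6.x (an assumption here, not something this PR changes) is to force the spawn start method for the workers:

```
# Assumed workaround, not part of the PR: have vLLM start worker processes
# with 'spawn' instead of 'fork', so each worker initializes CUDA in a fresh
# process rather than inheriting an already-initialized CUDA context.
export VLLM_WORKER_MULTIPROC_METHOD=spawn
```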

```
export CHECKPOINT_PATH=${PWD}/Llama-3.1-405B-Instruct
git lfs install
git clone https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct ${CHECKPOINT_PATH}
```
Contributor

Please add the commit as of today (there has been no update since 9/25) so we are fixed on one version of the checkpoint
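
A minimal sketch of how the checkpoint could be pinned after cloning (the `<commit-from-2024-09-25>` hash is a placeholder, not the actual revision the reviewer is asking for):

```
# Placeholder revision; substitute the concrete commit hash agreed on for the checkpoint.
cd ${CHECKPOINT_PATH}
git checkout <commit-from-2024-09-25>
```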

mrmhodak merged commit 45544f3 into master on Dec 3, 2024
11 checks passed
github-actions bot locked and limited conversation to collaborators on Dec 3, 2024