Llm large reference #1951

Merged
merged 14 commits into master on Dec 3, 2024
Conversation

pgmpablo157321
Contributor

No description provided.

pgmpablo157321 requested a review from a team as a code owner on December 2, 2024 at 15:46
Contributor

github-actions bot commented Dec 2, 2024

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

vllm==0.6.3
pybind11==2.10.4
--extra-index-url https://download.pytorch.org/whl/nightly/cpu
torch==2.2.0.dev20231006+cpu
Contributor

This torch version doesn't exist
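
One way to check which torch builds the nightly CPU index actually serves before pinning a version (not part of this PR, and `pip index` is still an experimental pip command, so treat this as a sketch):

```
# Not from the PR: list the torch versions published on the nightly CPU index,
# so the requirements file can pin a dev build that actually exists.
pip index versions torch --index-url https://download.pytorch.org/whl/nightly/cpu
```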

@nvzhihanj
Contributor

WARNING 11-28 01:43:11 multiproc_gpu_executor.py:53] Reducing Torch parallelism from 112 threads to 1 to avoid unnecessary CPU contention. Set OMP_NUM_THREADS in the external environment to tune this value as needed.
INFO 11-28 01:43:11 custom_cache_manager.py:17] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager
(VllmWorkerProcess pid=899) INFO 11-28 01:43:11 multiproc_worker_utils.py:216] Worker ready; awaiting tasks
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231] Exception in worker VllmWorkerProcess while processing method init_device: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method, Traceback (most recent call last):
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]   File "/home/zhihanj/.local/lib/python3.10/site-packages/vllm/executor/multiproc_worker_utils.py", line 224, in _run_worker_process
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]     output = executor(*args, **kwargs)
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]   File "/home/zhihanj/.local/lib/python3.10/site-packages/vllm/worker/worker.py", line 166, in init_device
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]     torch.cuda.set_device(self.device)
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]   File "/home/zhihanj/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 420, in set_device
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]     torch._C._cuda_setDevice(device)
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]   File "/home/zhihanj/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 300, in _lazy_init
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231]     raise RuntimeError(
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231] RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
(VllmWorkerProcess pid=899) ERROR 11-28 01:43:11 multiproc_worker_utils.py:231] 
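
The traceback shows vLLM's worker processes being forked after CUDA was already initialized in the parent process. A common workaround with vLLM 0.6.x (an assumption here, not something this PR changes) is to force the spawn start method for the workers:

```
# Assumed workaround, not part of the PR: have vLLM start worker processes
# with 'spawn' instead of 'fork', so each worker initializes CUDA in a fresh
# process rather than inheriting an already-initialized CUDA context.
export VLLM_WORKER_MULTIPROC_METHOD=spawn
```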

```
export CHECKPOINT_PATH=${PWD}/Llama-3.1-405B-Instruct
git lfs install
git clone https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct ${CHECKPOINT_PATH}
```
Contributor

Please add the commit as of today (there has been no update since 9/25) so we are fixed on one version of the checkpoint
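
A minimal sketch of how the checkpoint could be pinned after cloning (the `<commit-from-2024-09-25>` hash is a placeholder, not the actual revision the reviewer is asking for):

```
# Placeholder revision; substitute the concrete commit hash agreed on for the checkpoint.
cd ${CHECKPOINT_PATH}
git checkout <commit-from-2024-09-25>
```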

mrmhodak merged commit 45544f3 into master on Dec 3, 2024
11 checks passed
github-actions bot locked and limited conversation to collaborators on Dec 3, 2024