bug: the new image doesn't have the accelerate command in the PATH #666
Open
Description
Describe the bug
The image built from main does not have the accelerate command available in the PATH.
imageID: 'quay.io/foundation-model-stack/fms-hf-tuning@sha256:4f677383d504502fa73d5fe3b62048189211cf4beaaf8d56ca26045228c33ac8'
image: 'quay.io/foundation-model-stack/fms-hf-tuning:main-nvcr-latest'
1000800000@example:~$ echo $SHELL
/bin/bash
1000800000@example:~$ accelerate
bash: accelerate: command not found
Platform
Please provide details about the environment you are using, including the following:
- Interpreter version: Python 3.12.3 (main, Jan 17 2025, 18:03:48) [GCC 13.3.0] on linux
- Library version: latest main
1000800000@example:~/fms-hf-tuning$ pwd
/app/fms-hf-tuning
1000800000@example:~/fms-hf-tuning$ git log
commit 3c27af0e6886485838e914a0813828316c8d9b8c (grafted, HEAD -> main, origin/main)
Author: Dushyant Behl <dushyantbehl@users.noreply.github.com>
Date: Wed Mar 4 21:05:14 2026 +0530
Add app folder in nvcr image to mimic dockerfile (#665)
Signed-off-by: Dushyant Behl <dushyantbehl@in.ibm.com>
Sample Code
You can bring up the container locally with docker/podman.
Here's a pod yaml if you want to run in a K8s/Openshift cluster
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
    - name: mycontainer
      image: 'quay.io/foundation-model-stack/fms-hf-tuning:main-nvcr-latest'
      command:
        - bash
        - '-c'
        - |
          echo 'sleeping...'
          tail -f /dev/null
Go inside the container and try to run accelerate
oc exec -it <podname> -- bash
Expected behavior
The accelerate command should be available in the PATH
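The expectation above could be turned into a small smoke check that runs inside the image after a build. This is only a sketch: `missing_commands` is a hypothetical helper name, not something in fms-hf-tuning, and it relies solely on the standard library's `shutil.which`, which resolves a name against the current PATH.

```python
import shutil


def missing_commands(commands):
    """Return the names from `commands` that cannot be resolved on PATH.

    Hypothetical helper for illustration; not part of fms-hf-tuning.
    """
    return [name for name in commands if shutil.which(name) is None]
```

Given the session in this report, `missing_commands(["accelerate"])` would presumably return a non-empty list in the affected image, and an empty list once the command is back on the PATH.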
Observed behavior
$ accelerate
bash: accelerate: command not found
Additional context
$ docker run --rm -it --entrypoint bash quay.io/foundation-model-stack/fms-hf-tuning:main-nvcr-latest
root@af4efe7bc238:~# accelerate
bash: accelerate: command not found
root@af4efe7bc238:~# pwd
/app
root@af4efe7bc238:~# ls -la
total 20
drwxrwxr-x. 1 root root 42 Mar 4 16:55 .
drwxr-xr-x. 1 root root 10 Mar 5 07:55 ..
-rw-r--r--. 1 root root 3247 Mar 4 15:35 accelerate_fsdp_defaults.yaml
-rwxr-xr-x. 1 root root 8294 Mar 4 15:35 accelerate_launch.py
drwxr-xr-x. 2 root root 22 Mar 4 16:55 build
drwxr-xr-x. 13 root root 4096 Mar 4 15:44 fms-hf-tuning
root@af4efe7bc238:~# echo $PATH
/usr/local/lib/python3.12/dist-packages/torch_tensorrt/bin:/usr/local/mpi/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/ucx/bin:/opt/amazon/efa/bin:/opt/tensorrt/bin
root@af4efe7bc238:~# echo $VIRTUAL_ENV
root@af4efe7bc238:~# echo $CONDA_PREFIX
root@af4efe7bc238:~# python3
Python 3.12.3 (main, Jan 17 2025, 18:03:48) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
root@af4efe7bc238:~# python3 --version
Python 3.12.3
root@af4efe7bc238:~# python
Python 3.12.3 (main, Jan 17 2025, 18:03:48) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
root@af4efe7bc238:~# python --version
Python 3.12.3
root@af4efe7bc238:~# which python
/usr/bin/python
root@af4efe7bc238:~# which python3
/usr/bin/python3
root@af4efe7bc238:~# python3
Python 3.12.3 (main, Jan 17 2025, 18:03:48) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/usr/lib/python312.zip', '/usr/lib/python3.12', '/usr/lib/python3.12/lib-dynload', '/usr/local/lib/python3.12/dist-packages', '/usr/local/lib/python3.12/dist-packages/nvfuser-0.2.25a0+6627725-py3.12-linux-x86_64.egg', '/usr/local/lib/python3.12/dist-packages/lightning_thunder-0.2.0.dev0-py3.12.egg', '/usr/local/lib/python3.12/dist-packages/opt_einsum-3.4.0-py3.12.egg', '/usr/local/lib/python3.12/dist-packages/dill-0.3.9-py3.12.egg', '/usr/local/lib/python3.12/dist-packages/lightning_utilities-0.12.0-py3.12.egg', '/usr/local/lib/python3.12/dist-packages/looseversion-1.3.0-py3.12.egg', '/usr/local/lib/python3.12/dist-packages/sympy-1.13.1-py3.12.egg', '/usr/lib/python3/dist-packages']
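The session above shows that the command is missing from the PATH, but not whether the accelerate package itself is importable by this interpreter. A possible next diagnostic step, sketched below, separates the two cases. `diagnose_command` is a hypothetical helper name. The reported `scripts_dir` is only the interpreter's default script location from `sysconfig`, so a package installed with a different prefix or target may have placed its scripts elsewhere.

```python
import importlib.util
import os
import shutil
import sysconfig


def diagnose_command(command: str, module: str) -> dict:
    """Report whether `command` is on PATH and whether `module` is importable.

    Hypothetical helper for illustration. `module` should be a top-level
    module name, such as "accelerate".
    """
    path_dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    # Default directory where this interpreter's install scheme puts console scripts.
    scripts_dir = sysconfig.get_path("scripts")
    # find_spec returns None for a top-level module that is not installed.
    spec = importlib.util.find_spec(module)
    return {
        "command_on_path": shutil.which(command),
        "module_origin": spec.origin if spec else None,
        "scripts_dir": scripts_dir,
        "scripts_dir_on_path": scripts_dir in path_dirs,
    }


if __name__ == "__main__":
    for key, value in diagnose_command("accelerate", "accelerate").items():
        print(f"{key}: {value}")
```

If `module_origin` is set while `command_on_path` is `None`, the package is installed but its console script is not reachable through the PATH. If both are `None`, the package is most likely not installed for `/usr/bin/python3` at all.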