Description
Hi, after installing carefree-creator in Docker, it fails on startup with the error below. How can I solve it?
WARNING: CUDA Minor Version Compatibility mode ENABLED.
Using driver version 510.108.03 which has support for CUDA 11.6. This container
was built with CUDA 11.8 and will be run in Minor Version Compatibility mode.
CUDA Forward Compatibility is preferred over Minor Version Compatibility for use
with this container but was unavailable:
[[Forward compatibility was attempted on non supported HW (CUDA_ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE) cuInit()=804]]
See https://docs.nvidia.com/deploy/cuda-compatibility/ for details.
NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for PyTorch. NVIDIA recommends the use of the following flags:
docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 ...
Traceback (most recent call last):
  File "/opt/conda/bin/cfcreator", line 5, in <module>
    from cfcreator.cli import main
  File "/opt/conda/lib/python3.8/site-packages/cfcreator/__init__.py", line 2, in <module>
    from .common import *
  File "/opt/conda/lib/python3.8/site-packages/cfcreator/common.py", line 22, in <module>
    from cflearn.zoo import DLZoo
  File "/opt/conda/lib/python3.8/site-packages/cflearn/__init__.py", line 3, in <module>
    from .schema import *
  File "/opt/conda/lib/python3.8/site-packages/cflearn/schema.py", line 27, in <module>
    from accelerate import Accelerator
  File "/opt/conda/lib/python3.8/site-packages/accelerate/__init__.py", line 3, in <module>
    from .accelerator import Accelerator
  File "/opt/conda/lib/python3.8/site-packages/accelerate/accelerator.py", line 34, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/opt/conda/lib/python3.8/site-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/opt/conda/lib/python3.8/site-packages/accelerate/utils/__init__.py", line 112, in <module>
    from .launch import (
  File "/opt/conda/lib/python3.8/site-packages/accelerate/utils/launch.py", line 27, in <module>
    from ..utils.other import merge_dicts
  File "/opt/conda/lib/python3.8/site-packages/accelerate/utils/other.py", line 24, in <module>
    from .transformer_engine import convert_model
  File "/opt/conda/lib/python3.8/site-packages/accelerate/utils/transformer_engine.py", line 21, in <module>
    import transformer_engine.pytorch as te
  File "/opt/conda/lib/python3.8/site-packages/transformer_engine/__init__.py", line 7, in <module>
    from . import pytorch
  File "/opt/conda/lib/python3.8/site-packages/transformer_engine/pytorch/__init__.py", line 6, in <module>
    from .module import LayerNormLinear
  File "/opt/conda/lib/python3.8/site-packages/transformer_engine/pytorch/module.py", line 16, in <module>
    import transformer_engine_extensions as tex
ImportError: /opt/conda/lib/python3.8/site-packages/transformer_engine_extensions.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c106SymInt8toSymIntENS_13intrusive_ptrINS_14SymIntNodeImplENS_6detail34intrusive_target_default_null_typeIS2_EEEE
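
As a possible first diagnostic step, the final ImportError line can be split into the shared library that failed to load and the mangled C++ symbol it was missing. The sketch below does that and reads the leading namespace and function names out of the symbol. The helper names `parse_undefined_symbol` and `leading_nested_names` are hypothetical and not part of carefree-creator, accelerate, or transformer_engine. The `c10` namespace belongs to PyTorch's core library, so this may indicate that the transformer_engine extension was compiled against a different PyTorch build than the one currently installed in the container. That is an assumption about this environment, not something the log itself confirms.

```python
import re

# The last line of the traceback above, without the "ImportError: " prefix.
MESSAGE = (
    "/opt/conda/lib/python3.8/site-packages/"
    "transformer_engine_extensions.cpython-38-x86_64-linux-gnu.so: "
    "undefined symbol: _ZN3c106SymInt8toSymIntENS_13intrusive_ptrINS_"
    "14SymIntNodeImplENS_6detail34intrusive_target_default_null_typeIS2_EEEE"
)


def parse_undefined_symbol(message):
    """Split an 'undefined symbol' message into (library path, mangled symbol)."""
    match = re.match(r"(?P<lib>\S+): undefined symbol: (?P<symbol>\S+)", message)
    if match is None:
        return None
    return match.group("lib"), match.group("symbol")


def leading_nested_names(symbol):
    """Read the length-prefixed identifiers at the start of a mangled nested name (_ZN...)."""
    if not symbol.startswith("_ZN"):
        return []
    names, pos = [], 3
    while pos < len(symbol) and symbol[pos].isdigit():
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])  # each identifier is preceded by its length
        names.append(symbol[end:end + length])
        pos = end + length
    return names


lib, symbol = parse_undefined_symbol(MESSAGE)
print(lib)
print("::".join(leading_nested_names(symbol)))  # prints c10::SymInt::toSymInt
```

If the missing symbol does come from PyTorch, comparing the PyTorch version shipped with the container image against the one installed afterwards would be a reasonable next check.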