Quick start and Installation errors in docker container #389

@scoleri-mr

I have been trying for a while to build a Docker container in which to run nanotron. So far I have not been able to run even the quickstart, although I am following the installation guide in the README as closely as possible. This is the Dockerfile I am currently using, after many different experiments:

FROM nvidia/cuda:12.4.1-devel-ubuntu22.04

ARG DEBIAN_FRONTEND=noninteractive

# System deps
RUN apt-get update && apt-get install -y --no-install-recommends \
    git git-lfs \
    build-essential \
    curl ca-certificates \
    python3 python3-venv python3-pip python3.11-dev \
    && rm -rf /var/lib/apt/lists/*

# uv
RUN curl -LsSf https://astral.sh/uv/install.sh | sh && \
    ln -s /root/.local/bin/uv /usr/local/bin/uv

RUN update-alternatives --install /usr/bin/python python /usr/bin/python3 1
ENV PIP_NO_CACHE_DIR=1

# Deps
RUN uv pip install --system torch --index-url https://download.pytorch.org/whl/cu124 && \
    uv pip install --system datasets transformers "datatrove[io]" numba wandb && \
    uv pip install --system ninja triton "flash_attn==2.7.4.post1" --no-build-isolation && \
    uv pip install --system git+https://github.com/huggingface/nanotron.git@nanotron-working-branch

# Extra dependency: psutil
RUN uv pip install --system psutil

WORKDIR /app
COPY . /app
ENV PYTHONPATH=/app:${PYTHONPATH}

CMD ["/bin/bash"]

I have tried several variations of this: keeping flash-attn>=2.5.0 as in the guide, pinning torch to 2.4.x, and installing from the main branch, the smollm3 branch, and nanotron-working-branch.
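Since the variations above differ mostly in which package versions end up installed, a small script like the following could be run inside the container to record what was actually resolved. This is only a minimal sketch: the package list is an assumption about what is relevant here, and the helper name `installed_versions` is made up for illustration. It relies only on the standard-library `importlib.metadata` module.

```python
import sys
from importlib import metadata

# Assumed list of packages worth reporting; adjust as needed.
PACKAGES = ["torch", "flash_attn", "triton", "transformers", "datasets", "nanotron"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print the interpreter version first, then one line per package.
    print("python", sys.version.split()[0])
    for name, version in installed_versions(PACKAGES).items():
        print(name, version or "not installed")
```

Pasting the output next to each Dockerfile variant would make it easier to tell which combination of torch, flash-attn, and nanotron branch produced which error.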

The different modifications got me past several errors, and this Dockerfile has gotten me the furthest. I am now stuck on the following error when running run_train.py with the tiny llama config, as explained in the README:

AttributeError: 'ModelArgs' object has no attribute 'model'
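An error of this shape means some code accessed `.model` on an object that does not define it. One generic way to narrow this down is to list which attributes the object really has at the failing line. The sketch below uses a hypothetical stand-in dataclass, not nanotron's actual `ModelArgs`, and the field names in it are invented purely for illustration.

```python
from dataclasses import dataclass, fields, is_dataclass


def available_fields(obj):
    """List the attribute names an object exposes (dataclass fields or instance attributes)."""
    if is_dataclass(obj):
        return sorted(f.name for f in fields(obj))
    return sorted(name for name in vars(obj) if not name.startswith("_"))


@dataclass
class StandInModelArgs:
    # Hypothetical stand-in; these fields are NOT taken from nanotron.
    model_config: dict
    dtype: str = "bfloat16"


args = StandInModelArgs(model_config={})
print(available_fields(args))   # ['dtype', 'model_config']
print(hasattr(args, "model"))   # False, so args.model would raise AttributeError
```

Calling a helper like this on the real config object just before the failing access would show whether the attribute was renamed or moved in the installed branch, which would point to a code or config version mismatch rather than a container problem.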

Is something wrong with the Docker container, or is the problem elsewhere?
