
feat: add Dockerfile for containerized fine-tuning #114

Open

abdelhadi703 wants to merge 1 commit into mistralai:main from abdelhadi703:feat/dockerfile

Conversation

@abdelhadi703

Summary

Adds a Dockerfile for containerized fine-tuning, addressing recurring environment setup issues (#98, #109, #92).

What's included

  • Dockerfile: Based on pytorch/pytorch:2.2.0-cuda12.1-cudnn8-devel with all dependencies pre-installed
  • .dockerignore: Excludes unnecessary files from the build context
  • README update: Docker usage instructions (single-GPU and multi-GPU)
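For context, a minimal sketch of what such a Dockerfile could look like, based on the base image and dependencies named in this PR (file names like requirements.txt and the train module layout are assumptions, not confirmed here):

```dockerfile
# Base image pins torch 2.2 and ships CUDA 12.1 devel headers (needed to build xformers)
FROM pytorch/pytorch:2.2.0-cuda12.1-cudnn8-devel

WORKDIR /app

# Install dependencies first so Docker layer caching survives code-only edits
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the repository (build context trimmed via .dockerignore)
COPY . .

# Default entrypoint for single-GPU runs; override with --entrypoint torchrun
# for distributed training
ENTRYPOINT ["python", "-m", "train"]
```

Installing requirements before copying the source keeps rebuilds fast, since the dependency layer is only invalidated when requirements.txt changes.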

Usage

# Build
docker build -t mistral-finetune .

# Single GPU
docker run --gpus all -v /data:/data -v /model:/model mistral-finetune --config /data/config.yaml

# Multi-GPU (torchrun)
docker run --gpus all --entrypoint torchrun mistral-finetune \
  --nproc_per_node=4 /app/train.py --config /data/config.yaml

Design decisions

  • Base image: pytorch/pytorch:2.2.0-cuda12.1-cudnn8-devel matches the pinned torch==2.2 requirement and includes CUDA development headers needed for xformers compilation
  • Volume mounts: Training data and model weights are mounted at runtime (not baked into the image) for flexibility
  • Entrypoint: Defaults to python -m train for simplicity; override with torchrun for distributed training
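Since training data and weights are mounted at runtime rather than baked in, the .dockerignore would exclude them (and other build noise) from the build context. A sketch of plausible entries (the exact list is an assumption, not taken from this PR):

```
# Version control and caches
.git
__pycache__/
*.pyc

# Virtual environments
.venv/

# Data and weights are volume-mounted at runtime, never copied into the image
data/
model/
*.safetensors
```

Keeping large artifacts out of the context both speeds up `docker build` and avoids accidentally shipping weights inside the image.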

Related issues

#98, #109, #92

Commit message

Add Dockerfile based on pytorch/pytorch:2.2.0-cuda12.1-cudnn8-devel
with all required dependencies (torch 2.2, triton, xformers).
Includes .dockerignore and README documentation for single-GPU
and multi-GPU (torchrun) usage with volume mounts.
