diff --git a/README.md b/README.md
index d208c8f6..2febf90c 100644
--- a/README.md
+++ b/README.md
@@ -42,8 +42,8 @@ The benchmark is defined in the [benchmark](benchmark/) folder, where there exis
 There are two games to benchmark: supervised and zero-shot. Each game has its selected list of models and datasets defined in `dvc.yaml`.
 
-- Supervised game is defined in this [dvc.yaml](supervised/local/dvc.yaml)
-- Zero-shot game is defined in this [dvc.yaml](zero_shot/local/dvc.yaml)
+- Supervised game is defined in this [dvc.yaml](benchmark/supervised/local/dvc.yaml)
+- Zero-shot game is defined in this [dvc.yaml](benchmark/zero_shot/local/dvc.yaml)
 
 The models and datasets are defined in `vars` at the top, and DVC translates `vars` into a matrix, which is effectively a loop defined as the following pseudo-code:
 
@@ -89,12 +89,13 @@ The difference of the AWS environment is that:
 > c. SSO region: `us-east-1`.
 > d. SSO registration scopes: Leave empty.
 > e. Login via browser.
-> 2. Select the account: `ifflabdev`.
+> 3. Select the account: `ifflabdev`.
 > a. Default client Region is `us-east-1`.
 > b. CLI default output: Leave empty.
 > c. Profile name: `pg2benchmark`.
 > 4. You can find your account ID and profile by executing `cat ~/.aws/config`.
 > 5. Finally, you can run `dvc repro` with environment variables in each game: `AWS_ACCOUNT_ID=xxx AWS_PROFILE=yyy dvc repro`
+> 6. After the first setup, you can authenticate through the CLI: `aws sso login --profile `
 
 #### Supervised
 
diff --git a/models/README.md b/models/README.md
index c067dae7..d792e683 100644
--- a/models/README.md
+++ b/models/README.md
@@ -204,3 +204,71 @@ scores.to_csv(
     index=False,
 )
 ```
+
+## Building the Dockerfile
+
+A Dockerfile is a text file that contains instructions for building a Docker image - think of it as a recipe that tells Docker how to create a consistent, isolated environment for your model. This ensures your model runs the same way across different machines and environments.
+Docker solves the "it works on my machine" problem, allowing models to run identically across different hardware and configurations for reproducibility.
+
+### Basic Dockerfile Structure
+
+Every model needs a Dockerfile that follows this pattern:
+
+```dockerfile
+# 1. Start with a base Python image
+FROM python:3.12-slim-bookworm
+
+# 2. Install system dependencies
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    ca-certificates \
+    git \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
+
+# 3. Install uv (fast Python package manager)
+COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
+
+# 4. Set working directory
+# We specifically use /opt/program as AWS expects the files to be present in this location.
+WORKDIR /opt/program
+
+# 5. Copy benchmark framework
+COPY ./README.md ./pg2-benchmark/README.md
+COPY ./pyproject.toml ./pg2-benchmark/pyproject.toml
+COPY ./src ./pg2-benchmark/src
+
+# 6. Copy your model's configuration
+COPY ./models/YOUR_MODEL/README.md ./README.md
+COPY ./models/YOUR_MODEL/pyproject.toml ./pyproject.toml
+
+# 7. Handle private repository access (if needed)
+ARG GIT_CACHE_BUST=1
+RUN --mount=type=secret,id=git_auth \
+    git config --global credential.helper store && \
+    cat /run/secrets/git_auth > ~/.git-credentials && \
+    chmod 600 ~/.git-credentials
+
+# 8. Install Python dependencies
+RUN uv sync --no-cache
+
+# 9. Copy your model's source code
+COPY ./models/YOUR_MODEL/src ./src
+
+# 10. Set the entry point
+ENTRYPOINT ["uv", "run", "pg2-model"]
+```
+
+### Building and Testing
+
+To build your Docker image:
+
+```bash
+# From the project root directory
+docker build -f models/YOUR_MODEL/Dockerfile -t your-model .
+```
+
+To test it locally:
+
+```bash
+docker run --rm your-model train --help
+```
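
A note alongside the diff above: the Dockerfile's step 7 uses `RUN --mount=type=secret,id=git_auth`, so BuildKit needs that secret supplied at build time via `docker build --secret`. A minimal sketch, not part of the patch itself; the credentials file path (`~/.git-credentials`) and the `your-model` tag are illustrative assumptions:

```bash
# Hypothetical invocation: supply the git_auth secret expected by the
# RUN --mount=type=secret,id=git_auth step. Adjust the src= path to
# wherever your git credentials actually live.
docker build \
  --secret id=git_auth,src="$HOME/.git-credentials" \
  -f models/YOUR_MODEL/Dockerfile \
  -t your-model .
```

Without the `--secret` flag, `cat /run/secrets/git_auth` in step 7 would fail, since the secret file is only mounted for that single `RUN` instruction and never persists into the final image.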