ProteinGym · karel-w · Sep 29, 2025 · Sep 29, 2025 · tintinrevient · Sep 29, 2025
diff --git a/README.md b/README.md
@@ -42,8 +42,8 @@ The benchmark is defined in the [benchmark](benchmark/) folder, where there exis
 
 There are two games to benchmark: supervised and zero-shot. Each game has its selected list of models and datasets defined in `dvc.yaml`.
 
-- Supervised game is defined in this [dvc.yaml](supervised/local/dvc.yaml)
-- Zero-shot game is defined in this [dvc.yaml](zero_shot/local/dvc.yaml)
+- Supervised game is defined in this [dvc.yaml](benchmark/supervised/local/dvc.yaml)
+- Zero-shot game is defined in this [dvc.yaml](benchmark/zero_shot/local/dvc.yaml)
 
 The models and datasets are defined in `vars` at the top, and DVC translates `vars` into a matrix, which is namely a loop defined as the following pseudo-code:
 
@@ -89,12 +89,13 @@ The difference of the AWS environment is that:
 >   c. SSO region: `us-east-1`.
 >   d. SSO registration scopes: Leave empty.
 >   e. Login via browser.
-> 2. Select the account: `ifflabdev`.
+> 3. Select the account: `ifflabdev`.
 >   a. Default client Region is `us-east-1`.
 >   b. CLI default output: Leave empty.
 >   c. Profile name: `pg2benchmark`.
 > 4. You can find your account ID and profile by executing `cat ~/.aws/config`.
 > 5. Finally, you can run `dvc repro` with environment variables in each game: `AWS_ACCOUNT_ID=xxx AWS_PROFILE=yyy dvc repro`
+> 6. After the initial setup, you can re-authenticate from the CLI: `aws sso login --profile <profile_name>`
 
 #### Supervised
 

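Steps 5 and 6 of the AWS setup above can be sketched as two small shell helpers. This is a minimal sketch, not part of the change itself: the helper names `run`, `sso_login`, and `repro_on_aws` and the `DRY_RUN` switch are assumptions introduced for illustration. Only `aws sso login --profile` and the `AWS_ACCOUNT_ID=... AWS_PROFILE=... dvc repro` invocation come from the README.

```bash
#!/usr/bin/env bash
# Sketch only: run, sso_login, repro_on_aws and DRY_RUN are hypothetical names.

# Execute a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 6: refresh the SSO session for an already configured profile.
sso_login() {
  run aws sso login --profile "$1"
}

# Step 5: run the pipeline with both environment variables set.
# $1 is the account ID, $2 the profile name (both listed in ~/.aws/config).
repro_on_aws() {
  run env "AWS_ACCOUNT_ID=$1" "AWS_PROFILE=$2" dvc repro
}
```

From a game directory this would be called as `sso_login pg2benchmark` followed by `repro_on_aws <account_id> pg2benchmark`. Setting `DRY_RUN=1` first prints both commands without touching AWS or DVC, which is a cheap way to check the values before a long run.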
diff --git a/models/README.md b/models/README.md
@@ -204,3 +204,71 @@ scores.to_csv(
     index=False,
 )
 ```
+
+## Building the Dockerfile
+
+A Dockerfile is a text file containing the instructions for building a Docker image. Think of it as a recipe that tells Docker how to create a consistent, isolated environment for your model. This solves the "it works on my machine" problem: the model runs the same way across different machines, hardware, and configurations, which makes benchmark results reproducible.
+
+### Basic Dockerfile Structure
+
+Every model needs a Dockerfile that follows this pattern:
+
+```dockerfile
+# 1. Start with a base Python image
+FROM python:3.12-slim-bookworm
+
+# 2. Install system dependencies
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    ca-certificates \
+    git \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
+
+# 3. Install uv (fast Python package manager)
+COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
+
+# 4. Set working directory
+# We use /opt/program because AWS expects the files to be present at this location.
+WORKDIR /opt/program
+
+# 5. Copy benchmark framework
+COPY ./README.md ./pg2-benchmark/README.md
+COPY ./pyproject.toml ./pg2-benchmark/pyproject.toml
+COPY ./src ./pg2-benchmark/src
+
+# 6. Copy your model's configuration
+COPY ./models/YOUR_MODEL/README.md ./README.md
+COPY ./models/YOUR_MODEL/pyproject.toml ./pyproject.toml
+
+# 7. Handle private repository access (if needed)
+ARG GIT_CACHE_BUST=1
+RUN --mount=type=secret,id=git_auth \
+    git config --global credential.helper store && \
+    cat /run/secrets/git_auth > ~/.git-credentials && \
+    chmod 600 ~/.git-credentials
+
+# 8. Install Python dependencies
+RUN uv sync --no-cache
+
+# 9. Copy your model's source code
+COPY ./models/YOUR_MODEL/src ./src
+
+# 10. Set the entry point
+ENTRYPOINT ["uv", "run", "pg2-model"]
+```
+
+### Building and Testing
+
+To build your Docker image:
+
+```bash
+# From the project root directory
+docker build -f models/YOUR_MODEL/Dockerfile -t your-model .
 - docker build --build-arg GIT_CACHE_BUST=${git.git_cache_bust} --secret id=git_auth,src=../git-auth.txt -f ${item.model.dockerfile} -t ${item.model.name}:latest ../../.. 
+```
+
+To test it locally:
+
+```bash
+docker run --rm your-model train --help
 - docker run --rm -v $(realpath ${source.datasets_dir}):/datasets -v $(realpath ${source.models_dir}):/models -v $(realpath ${destination.output_dir}):/opt/ml/model ${item.model.name}:latest train --dataset-file ${item.dataset.container_path} --model-card-file ${item.model.container_path} 
+```
+
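Step 7 of the Dockerfile reads `/run/secrets/git_auth`, so a build that keeps that step will most likely need the `git_auth` secret passed on the command line, as the pipeline commands quoted above do. The following is a minimal sketch of both the build and a local training run. The function names, the `DRY_RUN` switch, and the default credentials path `git-auth.txt` are assumptions for illustration. The flags and the container mount points (`/datasets`, `/models`, `/opt/ml/model`) are taken from those quoted commands.

```bash
#!/usr/bin/env bash
# Sketch only: run, build_model_image, run_model_train and DRY_RUN are
# hypothetical names; adjust paths to your checkout.

# Execute a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Build from the project root.
# $1: directory name under models/   $2: image name (lowercase, e.g. my-model:latest)
# $3: credentials file (assumed default: git-auth.txt)
# $4: cache-bust value; a new value should make the RUN steps after the ARG rerun
build_model_image() {
  local model_dir="$1" image="$2"
  local auth_file="${3:-git-auth.txt}"
  local cache_bust="${4:-$(date +%s)}"
  run docker build \
    --build-arg "GIT_CACHE_BUST=${cache_bust}" \
    --secret "id=git_auth,src=${auth_file}" \
    -f "models/${model_dir}/Dockerfile" \
    -t "${image}" .
}

# Run the train command with datasets, model cards and output mounted.
# Host directories ($2..$4) should be absolute paths.
run_model_train() {
  local image="$1" datasets_dir="$2" models_dir="$3" output_dir="$4"
  local dataset_file="$5" model_card_file="$6"
  run docker run --rm \
    -v "${datasets_dir}:/datasets" \
    -v "${models_dir}:/models" \
    -v "${output_dir}:/opt/ml/model" \
    "${image}" train \
    --dataset-file "${dataset_file}" \
    --model-card-file "${model_card_file}"
}
```

Running with `DRY_RUN=1` prints the assembled commands, which makes it easy to compare them with what `dvc.yaml` generates before starting a real build.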