small fixes README #113
| Original file line number | Diff line number | Diff line change | ||
|---|---|---|---|---|
|
|
@@ -204,3 +204,71 @@ scores.to_csv( | |||
| index=False, | ||||
| ) | ||||
| ``` | ||||
|
|
||||
| ## Building the Dockerfile | ||||
|
|
||||
| A Dockerfile is a text file that contains instructions for building a Docker image. Think of it as a recipe that tells Docker how to create a consistent, isolated environment for your model. This addresses the "it works on my machine" problem: the model runs the same way across different machines, hardware, and configurations, which makes results easier to reproduce. | ||||
|
|
||||
| ### Basic Dockerfile Structure | ||||
|
|
||||
| Every model needs a Dockerfile that follows this pattern: | ||||
|
|
||||
| ```dockerfile | ||||
| # 1. Start with a base Python image | ||||
| FROM python:3.12-slim-bookworm | ||||
|
|
||||
| # 2. Install system dependencies | ||||
| RUN apt-get update && apt-get install -y --no-install-recommends \ | ||||
| ca-certificates \ | ||||
| git \ | ||||
| && apt-get clean \ | ||||
| && rm -rf /var/lib/apt/lists/* | ||||
|
|
||||
| # 3. Install uv (fast Python package manager) | ||||
| COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/ | ||||
|
|
||||
| # 4. Set working directory | ||||
| # We use /opt/program because AWS expects the files to be present in this location. | ||||
| WORKDIR /opt/program | ||||
|
|
||||
| # 5. Copy benchmark framework | ||||
| COPY ./README.md ./pg2-benchmark/README.md | ||||
| COPY ./pyproject.toml ./pg2-benchmark/pyproject.toml | ||||
| COPY ./src ./pg2-benchmark/src | ||||
|
|
||||
| # 6. Copy your model's configuration | ||||
| COPY ./models/YOUR_MODEL/README.md ./README.md | ||||
| COPY ./models/YOUR_MODEL/pyproject.toml ./pyproject.toml | ||||
|
|
||||
| # 7. Handle private repository access (if needed) | ||||
|
Contributor
This step will be removed as well once proteingym-base is public. |
||||
| ARG GIT_CACHE_BUST=1 | ||||
| RUN --mount=type=secret,id=git_auth \ | ||||
| git config --global credential.helper store && \ | ||||
| cat /run/secrets/git_auth > ~/.git-credentials && \ | ||||
| chmod 600 ~/.git-credentials | ||||
|
|
||||
| # 8. Install Python dependencies | ||||
| RUN uv sync --no-cache | ||||
|
|
||||
| # 9. Copy your model's source code | ||||
| COPY ./models/YOUR_MODEL/src ./src | ||||
|
|
||||
| # 10. Set the entry point | ||||
| ENTRYPOINT ["uv", "run", "pg2-model"] | ||||
| ``` | ||||
|
|
||||
| ### Building and Testing | ||||
|
|
||||
| To build your Docker image: | ||||
|
|
||||
| ```bash | ||||
| # From the project root directory | ||||
| docker build -f models/YOUR_MODEL/Dockerfile -t your-model . | ||||
|
Contributor
Based on the current settings, proteingym-base (a.k.a. pg2-dataset) is private, so the secret is needed to build the image: `docker build --secret id=git_auth,src=git-auth.txt ...` Reference is here: https://github.com/ProteinGym/pg2-model-esm The latest working reference is in this line in local DVC:
|
||||
| ``` | ||||
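While proteingym-base is private, step 7 of the Dockerfile reads a `git_auth` secret, so a build without it would likely fail at that step. The sketch below passes the secret with `--secret`, following the review comment. It assumes the credentials live in a local file named `git-auth.txt` written in git's credential-store format; `MODEL`, `SECRET_FILE`, and `DRY_RUN` are hypothetical variable names used only for illustration. By default the script only prints the command.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical placeholders: adjust to your model directory and credentials file.
MODEL="${MODEL:-YOUR_MODEL}"
# Assumed to hold one line in git credential-store format,
# e.g. https://USER:TOKEN@github.com
SECRET_FILE="${SECRET_FILE:-git-auth.txt}"

# Assemble the command as an array so arguments keep their quoting.
build_cmd=(
  docker build
  --secret "id=git_auth,src=${SECRET_FILE}"
  -f "models/${MODEL}/Dockerfile"
  -t your-model
  .
)

# Print the command by default; set DRY_RUN=0 to actually run the build.
if [[ "${DRY_RUN:-1}" == "1" ]]; then
  printf '%s\n' "${build_cmd[*]}"
else
  "${build_cmd[@]}"
fi
```

Run it from the project root. The secret is mounted only for the `RUN` instruction that requests it, so the credentials file itself is not copied into an image layer by `--secret`.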
|
|
||||
| To test it locally: | ||||
|
|
||||
| ```bash | ||||
| docker run --rm your-model train --help | ||||
|
Contributor
The volumes need to be attached. The latest working reference is in this line in local DVC:
|
||||
| ``` | ||||
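The review comment notes that volumes need to be attached for a real run. A minimal sketch of a bind mount with `docker run -v` follows. The host directory and the mount point inside the container are hypothetical (`HOST_DATA_DIR`, `CONTAINER_DATA_DIR`); the paths the benchmark actually expects are not stated here, so treat them as placeholders. By default the script only prints the command.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical paths: replace with the directories your setup really uses.
HOST_DATA_DIR="${HOST_DATA_DIR:-$(pwd)/data}"
CONTAINER_DATA_DIR="${CONTAINER_DATA_DIR:-/data}"

# -v HOST:CONTAINER bind-mounts the host directory into the container.
run_cmd=(
  docker run --rm
  -v "${HOST_DATA_DIR}:${CONTAINER_DATA_DIR}"
  your-model train --help
)

# Print the command by default; set DRY_RUN=0 to actually start the container.
if [[ "${DRY_RUN:-1}" == "1" ]]; then
  printf '%s\n' "${run_cmd[*]}"
else
  "${run_cmd[@]}"
fi
```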
|
|
||||
This step will change once the benchmark is public, so the doc needs to be updated accordingly.