52 changes: 33 additions & 19 deletions .github/actions/install-ci-dependencies/action.yml
@@ -37,28 +37,37 @@ runs:
# - requirements.txt excludes torch (dependencies resolved against nightly)
# - Skip on macOS ARM64 - nightly +cpu wheels aren't available, use stable torch
# - Skip for "oldest" builds - they use stable torch from lock file
- name: Install torch nightly
- name: Install torch prerelease (if configured)
if: inputs.use_oldest != 'true'
shell: bash
run: |
NIGHTLY_FILE="requirements/ci/torch-nightly.txt"
if [[ -f "${NIGHTLY_FILE}" ]]; then
TORCH_VERSION=$(grep -v '^#' "${NIGHTLY_FILE}" | grep -v '^$' | head -1 || true)
if [[ -n "${TORCH_VERSION}" ]]; then
PRE_FILE="requirements/ci/torch-pre.txt"
if [[ -f "${PRE_FILE}" ]]; then
# Read 3-line config: version, CUDA target, channel type
# Use while loop for bash 3.2 compatibility (macOS)
PRE_CONFIG=()
while IFS= read -r line; do
PRE_CONFIG+=("$line")
done < <(grep -v '^#' "${PRE_FILE}" | grep -v '^$')
TORCH_VERSION="${PRE_CONFIG[0]}"
CUDA_TARGET="${PRE_CONFIG[1]}"
CHANNEL_TYPE="${PRE_CONFIG[2]}"

if [[ -n "${TORCH_VERSION}" && -n "${CHANNEL_TYPE}" ]]; then
if [[ "${{ runner.os }}" == "macOS" ]]; then
echo "Skipping torch nightly on macOS ARM64 (no +cpu wheels available)"
echo "Skipping torch ${CHANNEL_TYPE} on macOS ARM64 (no +cpu wheels available)"
echo "Will install stable torch with --torch-backend=auto"
else
echo "Installing torch nightly: ${TORCH_VERSION}+cpu"
echo "Installing torch ${CHANNEL_TYPE}: ${TORCH_VERSION}+cpu"
uv pip install --prerelease=allow "torch==${TORCH_VERSION}+cpu" \
--index-url https://download.pytorch.org/whl/nightly/cpu
--index-url "https://download.pytorch.org/whl/${CHANNEL_TYPE}/cpu"
fi
fi
fi

# Install FTS and all dependencies
# - UV_OVERRIDE: applies Lightning commit pin from overrides.txt
# - When nightly configured (and not macOS): torch already installed, just install rest
# - When prerelease configured (and not macOS): torch already installed, just install rest
# - For macOS: use --torch-backend=auto for MPS-compatible stable torch
# - For stable builds: use --torch-backend=cpu
- name: Install FTS and all dependencies
@@ -74,27 +83,32 @@ runs:
echo "Installing with latest versions..."
fi

# Check if torch nightly is configured
NIGHTLY_FILE="requirements/ci/torch-nightly.txt"
TORCH_NIGHTLY="false"
if [[ -f "${NIGHTLY_FILE}" ]]; then
TORCH_VERSION=$(grep -v '^#' "${NIGHTLY_FILE}" | grep -v '^$' | head -1 || true)
# Check if torch prerelease is configured
PRE_FILE="requirements/ci/torch-pre.txt"
TORCH_PRERELEASE="false"
if [[ -f "${PRE_FILE}" ]]; then
# Use while loop for bash 3.2 compatibility (macOS)
PRE_CONFIG=()
while IFS= read -r line; do
PRE_CONFIG+=("$line")
done < <(grep -v '^#' "${PRE_FILE}" | grep -v '^$')
TORCH_VERSION="${PRE_CONFIG[0]}"
if [[ -n "${TORCH_VERSION}" ]]; then
TORCH_NIGHTLY="true"
TORCH_PRERELEASE="true"
fi
fi

# Determine install command based on platform and torch configuration
# - Oldest builds: use --torch-backend=cpu for stable torch
# - macOS: use --torch-backend=auto (MPS), nightly not available
# - Linux/Windows with nightly: torch already installed, no backend flag needed
# - macOS: use --torch-backend=auto (MPS), prerelease not available
# - Linux/Windows with prerelease: torch already installed, no backend flag needed
# - Linux/Windows with stable: use --torch-backend=cpu
if [[ "${{ inputs.use_oldest }}" == "true" ]]; then
INSTALL_CMD="uv pip install -e . -r ${REQ_FILE} --torch-backend=cpu"
elif [[ "${{ runner.os }}" == "macOS" ]]; then
INSTALL_CMD="uv pip install -e . -r ${REQ_FILE} --torch-backend=auto"
elif [[ "${TORCH_NIGHTLY}" == "true" ]]; then
# Torch nightly already installed, just install FTS and deps
elif [[ "${TORCH_PRERELEASE}" == "true" ]]; then
# Torch prerelease already installed, just install FTS and deps
INSTALL_CMD="uv pip install -e . -r ${REQ_FILE}"
else
INSTALL_CMD="uv pip install -e . -r ${REQ_FILE} --torch-backend=cpu"
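For reference, a minimal sketch of the three-line `requirements/ci/torch-pre.txt` format this action parses — version, CUDA target, channel type, one value per line. The concrete values below are assumptions inferred from the Dockerfile pins elsewhere in this PR, not the file's actual contents:

```bash
# Hypothetical requirements/ci/torch-pre.txt contents (the grep pipeline in
# the action strips comments and blank lines before parsing):
#   2.10.0   <- TORCH_VERSION (PRE_CONFIG[0])
#   cu128    <- CUDA_TARGET   (PRE_CONFIG[1])
#   test     <- CHANNEL_TYPE  (PRE_CONFIG[2]); "nightly" or "test"
cat > /tmp/torch-pre.txt <<'EOF'
2.10.0
cu128
test
EOF

# Same bash-3.2-compatible parse as the action (macOS ships bash 3.2,
# which lacks mapfile, hence the while-read loop):
PRE_CONFIG=()
while IFS= read -r line; do
  PRE_CONFIG+=("$line")
done < <(grep -v '^#' /tmp/torch-pre.txt | grep -v '^$')
echo "version=${PRE_CONFIG[0]} cuda=${PRE_CONFIG[1]} channel=${PRE_CONFIG[2]}"
```

The channel type also selects the index URL (`https://download.pytorch.org/whl/${CHANNEL_TYPE}/cpu`), which is why "nightly" and "test" are the expected values.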
37 changes: 37 additions & 0 deletions .github/copilot-instructions.md
@@ -199,6 +199,43 @@ source ${FTS_VENV_BASE}/${FTS_TARGET_VENV}/bin/activate
PL_RUN_STANDALONE_TESTS=1 python -m pytest tests/test_specific.py::test_function -v
```

### Building Documentation

**Documentation build commands (needs activated venv):**

```bash
export FTS_VENV_BASE=/mnt/cache/${USER}/.venvs
export FTS_TARGET_VENV=fts_latest
export FTS_REPO_DIR=${HOME}/repos/finetuning-scheduler # Example: adjust to your local repo path
# Activate your environment first
cd ${FTS_REPO_DIR} && \
source ${FTS_VENV_BASE}/${FTS_TARGET_VENV}/bin/activate

# Clean previous builds
cd docs && make clean

# Build HTML documentation with warnings as errors
make html --debug SPHINXOPTS="-W --keep-going"

# Run linkcheck to verify all links
make linkcheck SPHINXOPTS="-W --keep-going"

# Check for errors in linkcheck output
grep -i "error\|broken" build/linkcheck/output.txt || echo "No errors found in linkcheck"
```

**Documentation requirements:**

- All documentation must build without warnings when using `-W` flag
- All internal and external links must be valid (verified by linkcheck)
- RST cross-references should use appropriate directives:
- `:class:` for class references
- `:meth:` for method references
- `:func:` for function references
- `:doc:` for document references
- `:ref:` for section references (requires explicit label like `.. _label_name:`)
- Sphinx autosummary generates API documentation from docstrings

## Project Layout and Architecture

### Source Code Structure
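As a quick illustration of the cross-reference directives listed in the new documentation section above, here is a hedged sketch of how they appear in RST source (the referenced symbols and label are illustrative placeholders chosen to show the syntax, not guaranteed project references):

```bash
# Emit a sample RST snippet demonstrating the directives; run anywhere.
cat <<'RST'
.. _schedule_basics:

See :class:`~finetuning_scheduler.FinetuningScheduler` for the callback,
:meth:`~finetuning_scheduler.FinetuningScheduler.setup` for its setup hook,
and :ref:`schedule_basics` to cross-reference this labeled section.
RST
```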
3 changes: 2 additions & 1 deletion .gitignore
@@ -32,7 +32,8 @@ timit_data/
grid_generated*
grid_ori*


# we don't have repo-specific prompts at this juncture
.github/prompts/

# C extensions
*.so
2 changes: 1 addition & 1 deletion README.md
@@ -226,7 +226,7 @@ See the [versioning documentation](https://finetuning-scheduler.readthedocs.io/e
<details>
<summary>Current build statuses for Fine-Tuning Scheduler </summary>

| System / (PyTorch/Python ver) | 2.6.0/3.9 | 2.10.0/3.9, 2.10.0/3.12 |
| System / (PyTorch/Python ver) | 2.6.0/3.10 | 2.10.0/3.10, 2.10.0/3.12 |
| :---------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| Linux \[GPUs\*\*\] | - | [![Build Status](https://dev.azure.com//speediedan/finetuning-scheduler/_apis/build/status/Multi-GPU%20&%20Example%20Tests?branchName=main)](https://dev.azure.com/speediedan/finetuning-scheduler/_build/latest?definitionId=1&branchName=main) |
| Linux (Ubuntu 22.04) | [![Test](https://github.com/speediedan/finetuning-scheduler/actions/workflows/ci_test-full.yml/badge.svg?branch=main&event=push)](https://github.com/speediedan/finetuning-scheduler/actions/workflows/ci_test-full.yml) | [![Test](https://github.com/speediedan/finetuning-scheduler/actions/workflows/ci_test-full.yml/badge.svg?branch=main&event=push)](https://github.com/speediedan/finetuning-scheduler/actions/workflows/ci_test-full.yml) |
8 changes: 3 additions & 5 deletions dockers/base-cuda/Dockerfile
@@ -81,13 +81,11 @@ RUN \
else \
# or target a specific cuda build, by specifying a particular index url w/...
# ... default channel
# uv pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128; \
# ... pytorch patch version
# uv pip install torch==1.11.1+cu113 torchvision==0.11.3+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html; \
# uv pip install torch --torch-backend=cu128; \
# ... pytorch nightly dev version
uv pip install --prerelease=allow torch==2.10.0.dev20251124 --index-url https://download.pytorch.org/whl/nightly/cu128; \
# uv pip install --prerelease=allow torch==2.10.0.dev20251124 --index-url https://download.pytorch.org/whl/nightly/cu128; \
# ... test channel
# uv pip install --prerelease=allow torch==2.10.0 --index-url https://download.pytorch.org/whl/test/cu128; \
uv pip install --prerelease=allow torch==2.10.0 --index-url https://download.pytorch.org/whl/test/cu128; \
fi && \
# We avoid installing Lightning and other dependencies here as they are usually upgraded anyway later in
# CI but we may re-enable in the future.
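Since the Dockerfile now pins the test channel, here is a hedged one-liner for sanity-checking that a given pin actually resolves against a channel before rebuilding the image (the version/CUDA pair mirrors the diff and is a point-in-time pin that will age):

```bash
# Resolve without installing; --prerelease=allow is required for
# nightly/test wheels, matching the Dockerfile invocation above.
uv pip install --dry-run --prerelease=allow "torch==2.10.0" \
  --index-url https://download.pytorch.org/whl/test/cu128
```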
4 changes: 0 additions & 4 deletions docs/source/distributed/fsdp_scheduled_fine_tuning.rst
@@ -309,10 +309,6 @@ While not technically required, we add ``DebertaV2Embeddings`` separately as wel
As always, if needed, one can alternatively override ``configure_model`` and manually wrap a given
:external+pl:class:`~lightning.pytorch.core.module.LightningModule` to align with a desired fine-tuning schedule.

.. warning::

:class:`~finetuning_scheduler.strategy_adapters.FSDPStrategyAdapter` is in BETA and subject to change. The
interface can bring breaking changes and new features with the next release of PyTorch.

.. note::

7 changes: 7 additions & 0 deletions docs/source/index.rst
@@ -523,6 +523,13 @@ Footnotes
advanced/lr_scheduler_reinitialization
advanced/optimizer_reinitialization

.. toctree::
:maxdepth: 1
:name: Plugins
:caption: Plugins

plugins/strategy_adapter_entry_points

.. toctree::
:maxdepth: 1
:name: Basic Examples
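The new toctree entry assumes a matching source document; a quick hedged check that the target exists before building (path inferred from the toctree line plus the standard `.rst` suffix):

```bash
# Run from the repo root; toctree entries resolve relative to docs/source.
test -f docs/source/plugins/strategy_adapter_entry_points.rst \
  && echo "toctree target present" \
  || echo "missing: docs/source/plugins/strategy_adapter_entry_points.rst"
```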