
Commit aa750cf

Merge pull request #17 from olipinski/logdir
1. **`fts.py`**:
   - Introduced a `log_dir` parameter to allow specifying a custom directory for artifacts, defaulting to `trainer.log_dir` or `trainer.default_root_dir`.
   - Added a `log_dir` property to determine the directory dynamically.
   - Updated various methods to use the new `log_dir` property instead of directly accessing `trainer.log_dir`.
2. **`fts_supporters.py`**:
   - Adjusted methods to utilize the `log_dir` property for saving schedules and validating configurations.
3. **`test_finetuning_scheduler_callback.py`**:
   - Added a test to validate the behavior of `FinetuningScheduler` with and without a specified `log_dir` when using a logger without a `save_dir`. I used `MLFlowLogger` as one such test case of the more general pattern.

Arguably, Lightning should handle this scenario by updating its `trainer.log_dir` resolution logic to accommodate artifact persistence via `trainer.log_dir` for loggers that do not have a `save_dir` set (using `trainer.default_root_dir`), but your PR was an excellent opportunity to refactor and enhance FTS `log_dir` handling.
2 parents 40e4836 + 2f088ef commit aa750cf
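
For context, the resolution order described in the commit message (an explicit `log_dir`, then `trainer.log_dir`, then `trainer.default_root_dir`) can be sketched as a small standalone helper. This is a minimal illustration only: `resolve_artifact_dir` is a hypothetical name rather than the actual FTS property, and it assumes `trainer.log_dir` can be `None` for a logger without a `save_dir` (e.g. `MLFlowLogger` configured only with a tracking URI).

```python
from typing import Optional

from lightning.pytorch import Trainer


def resolve_artifact_dir(trainer: Trainer, log_dir: Optional[str] = None) -> str:
    """Sketch of the artifact-directory resolution described above.

    Preference order: an explicitly supplied ``log_dir``, then
    ``trainer.log_dir`` (assumed to possibly be ``None`` when the attached
    logger has no ``save_dir``), and finally ``trainer.default_root_dir``.
    """
    return log_dir or trainer.log_dir or trainer.default_root_dir
```

Routing every schedule/artifact write through a single resolved directory like this is the pattern the new `log_dir` property centralizes.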

File tree

28 files changed: +575 −166 lines


.azure-pipelines/gpu-tests.yml

Lines changed: 4 additions & 4 deletions
@@ -92,10 +92,10 @@ jobs:
       bash ./tests/special_tests.sh --mark_type=standalone --filter_pattern='test_f'
     displayName: 'Testing: standalone multi-gpu'

-  - bash: |
-      . /tmp/venvs/fts_dev/bin/activate
-      bash ./tests/special_tests.sh --mark_type=exp_patch --filter_pattern='test_f' --experiment_patch_mask="1 0 0 1"
-    displayName: 'Testing: experimental einsum patch'
+  # - bash: |
+  #     . /tmp/venvs/fts_dev/bin/activate
+  #     bash ./tests/special_tests.sh --mark_type=exp_patch --filter_pattern='test_f' --experiment_patch_mask="1 0 0 1"
+  #   displayName: 'Testing: Experimental Multi-GPU'

   - bash: |
       . /tmp/venvs/fts_dev/bin/activate

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -9,6 +9,9 @@ lightning_logs/
 # ignore ipynb files themselves since we want to only store the source *.py
 *.ipynb

+# SQLite database files
+*.db
+
 # Test-tube
 test_tube_*/

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -268,7 +268,7 @@ name/pattern-based configuration instead of manually inspecting modules and appl

 ### Added

-- **FSDP Scheduled Fine-Tuning** is now supported! [See the tutorial here.](https://finetuning-scheduler.readthedocs.io/en/stable/advanced/fsdp_scheduled_fine_tuning.html)
+- **FSDP Scheduled Fine-Tuning** is now supported! [See the tutorial here.](https://finetuning-scheduler.readthedocs.io/en/stable/distributed/fsdp_scheduled_fine_tuning.html)
 - Introduced [``StrategyAdapter``](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.strategy_adapters.html#finetuning_scheduler.strategy_adapters.StrategyAdapter)s. If you want to extend Fine-Tuning Scheduler (FTS) to use a custom, currently unsupported strategy or override current FTS behavior in the context of a given training strategy, subclassing ``StrategyAdapter`` is now a way to do so. See [``FSDPStrategyAdapter``](https://finetuning-scheduler.readthedocs.io/en/stable/api/finetuning_scheduler.strategy_adapters.html#finetuning_scheduler.strategy_adapters.FSDPStrategyAdapter) for an example implementation.
 - support for `pytorch-lightning` 1.9.0

CITATION.cff

Lines changed: 3 additions & 0 deletions
@@ -125,6 +125,9 @@ identifiers:
   - description: "Fine-Tuning Scheduler (v2.5.0)"
     type: doi
     value: 10.5281/zenodo.14537830
+  - description: "Fine-Tuning Scheduler (v2.5.1)"
+    type: doi
+    value: 10.5281/zenodo.15099039
 license: "Apache-2.0"
 url: "https://finetuning-scheduler.readthedocs.io/"
 repository-code: "https://github.com/speediedan/finetuning-scheduler"

dockers/base-cuda/Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -90,9 +90,9 @@ RUN \
     # ... pytorch nightly dev version
     #pip install --pre torch==2.7.0.dev20250201 torchvision==0.22.0.dev20250201 --index-url https://download.pytorch.org/whl/nightly/cu128; \
     # temporarily remove torchvision from the nightly build until it supports cu128 in nightlies
-    pip install --pre torch==2.7.0.dev20250201 --index-url https://download.pytorch.org/whl/nightly/cu128; \
+    #pip install --pre torch==2.7.0.dev20250201 --index-url https://download.pytorch.org/whl/nightly/cu128; \
     # ... test channel
-    #pip install --pre torch==2.7.0 torchvision==0.22.0 --index-url https://download.pytorch.org/whl/test/cu128; \
+    pip install --pre torch==2.7.0 torchvision==0.22.0 --index-url https://download.pytorch.org/whl/test/cu128; \
     fi && \
     # Install all requirements
     pip install -r requirements/devel.txt --no-cache-dir && \

docs/source/conf.py

Lines changed: 2 additions & 0 deletions
@@ -96,6 +96,8 @@ def _transform_changelog(path_in: str, path_out: str) -> None:
     "sphinx_togglebutton",
 ]

+autodoc_typehints = "none"
+
 # Suppress warnings about duplicate labels (needed for PL tutorials)
 suppress_warnings = [
     "autosectionlabel.*",

docs/source/index.rst

Lines changed: 3 additions & 3 deletions
@@ -100,9 +100,9 @@ thawed/unfrozen parameter groups associated with each fine-tuning phase as desir
 and executed in ascending order. In addition to being zero-indexed, fine-tuning phase keys should be contiguous and
 either integers or convertible to integers via ``int()``.

-1. First, generate the default schedule to ``Trainer.log_dir``. It will be named after your
-   :external+pl:class:`~lightning.pytorch.core.module.LightningModule` subclass with the suffix
-   ``_ft_schedule.yaml``.
+1. First, generate the default schedule (output to :paramref:`~finetuning_scheduler.fts.FinetuningScheduler.log_dir`,
+   defaults to ``Trainer.log_dir``). It will be named after your
+   :external+pl:class:`~lightning.pytorch.core.module.LightningModule` subclass with the suffix ``_ft_schedule.yaml``.

 .. code-block:: python

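As a usage-level sketch of the documented step above (not a reproduction of the elided ``code-block`` from the docs), the new `log_dir` parameter lets you pin where the generated ``*_ft_schedule.yaml`` lands; the directories below are arbitrary placeholders.

```python
from lightning.pytorch import Trainer
from finetuning_scheduler import FinetuningScheduler

# Write the generated ``*_ft_schedule.yaml`` (and other FTS artifacts) to an
# explicit directory instead of relying on ``Trainer.log_dir`` resolution.
fts = FinetuningScheduler(log_dir="./fts_artifacts")  # illustrative path
trainer = Trainer(callbacks=[fts], default_root_dir="./fts_runs")
```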
pyproject.toml

Lines changed: 3 additions & 3 deletions
@@ -7,11 +7,11 @@ requires = [
 [tool.ruff]
 line-length = 120
 # Enable Pyflakes `E` and `F` codes by default.
-select = [
+lint.select = [
     "E", "W",  # see: https://pypi.org/project/pycodestyle
     "F",  # see: https://pypi.org/project/pyflakes
 ]
-ignore = [
+lint.ignore = [
     "E731",  # Do not assign a lambda expression, use a def
 ]
 # Exclude a variety of commonly ignored directories.
@@ -23,7 +23,7 @@ exclude = [
     "build",
     "temp",
 ]
-ignore-init-module-imports = true
+lint.ignore-init-module-imports = true
 output-format = "pylint"

 [tool.ruff.per-file-ignores]

requirements/base.txt

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 #lightning>=2.7.0,<2.7.1
 # the below is uncommented when master is targeting a specific pl dev master commit
-git+https://github.com/Lightning-AI/lightning.git@efe311cd46a372aeb5912ea5adfeef573a5d64ca#egg=lightning
+git+https://github.com/Lightning-AI/lightning.git@669486afdd524fb66c1afc36bf93955384ac1224#egg=lightning
 torch>=2.4.0

requirements/docs.txt

Lines changed: 2 additions & 2 deletions
@@ -3,8 +3,8 @@ myst-parser==0.18.1
 nbsphinx>=0.8.5
 pandoc>=1.0
 docutils>=0.16
-sphinxcontrib-fulltoc>=1.0
-sphinxcontrib-mockautodoc
+#sphinxcontrib-fulltoc>=1.0
+#sphinxcontrib-mockautodoc
 sphinx-autodoc-typehints>=1.16
 sphinx-paramlinks>=0.5.1
 sphinx-togglebutton>=0.2
