Delete skyrl, move everything up a level, and delete skyrl-train and skyrl-tx so that everything is renamed#1135
Conversation
<!-- devin-review-badge-begin --> --- <a href="https://app.devin.ai/review/novasky-ai/skyrl/pull/1132" target="_blank"> <picture> <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1"> <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open with Devin"> </picture> </a> <!-- devin-review-badge-end -->
There was a problem hiding this comment.
Code Review
This pull request is a significant and valuable refactoring that unifies the skyrl-train and skyrl-tx packages into a single skyrl package at the root level. This greatly improves the project's structure and maintainability. The extensive updates to pyproject.toml introduce a modern, robust packaging setup using uv with well-defined optional dependencies for various backends. My review focuses on the new configuration files, where I've identified some opportunities for improvement. I've pointed out a few duplicate entries in the .gitignore file and have made several suggestions for pyproject.toml to enhance flexibility, clarity, and prevent potential performance issues. Overall, this is an excellent step forward for the project.
| transformer-engine-torch = [{requirement = "torch", match-runtime = true}, "build_tools", "ninja"] | ||
|
|
||
| [tool.uv.extra-build-variables] | ||
| flash-attn = { FLASH_ATTENTION_SKIP_CUDA_BUILD = "TRUE"} |
There was a problem hiding this comment.
Setting FLASH_ATTENTION_SKIP_CUDA_BUILD = "TRUE" is concerning. This will likely prevent flash-attn's CUDA extensions from being built, causing it to fall back to a much slower pure Python implementation and negating its performance benefits. If the goal is to use pre-built wheels, that should be configured through [tool.uv.sources] or by pinning to a specific wheel URL. Could you clarify the reasoning for this build variable? Without the CUDA extensions, there could be a significant performance regression.
| # Byte-compiled / optimized / DLL files | ||
| __pycache__/ | ||
| *.py[cod] | ||
| *$py.class | ||
|
|
||
| # C extensions | ||
| *.so | ||
|
|
||
| # Distribution / packaging | ||
| .Python | ||
| build/ | ||
| develop-eggs/ | ||
| dist/ | ||
| downloads/ | ||
| eggs/ | ||
| .eggs/ | ||
| lib/ | ||
| lib64/ | ||
| parts/ | ||
| sdist/ | ||
| var/ | ||
| wheels/ | ||
| pip-wheel-metadata/ | ||
| share/python-wheels/ | ||
| *.egg-info/ | ||
| .installed.cfg | ||
| *.egg | ||
| MANIFEST | ||
| uv.lock | ||
|
|
||
| # PyInstaller | ||
| *.manifest | ||
| *.spec | ||
|
|
||
| # Installer logs | ||
| pip-log.txt | ||
| pip-delete-this-directory.txt | ||
|
|
||
| # Unit test / coverage reports | ||
| htmlcov/ | ||
| .tox/ | ||
| .nox/ | ||
| .coverage | ||
| .coverage.* | ||
| .cache | ||
| nosetests.xml | ||
| coverage.xml | ||
| *.cover | ||
| *.py,cover | ||
| .hypothesis/ | ||
| .pytest_cache/ | ||
|
|
||
| # Jupyter Notebook | ||
| .ipynb_checkpoints | ||
|
|
||
| # Environments | ||
| .env | ||
| .venv | ||
| env/ | ||
| venv/ | ||
| ENV/ | ||
| env.bak/ | ||
| venv.bak/ | ||
|
|
||
| # MkDocs build output | ||
| site/ | ||
|
|
||
| # IDEs and editors | ||
| .idea/ | ||
| .vscode/ | ||
|
|
||
| # OS generated files | ||
| .DS_Store | ||
| Thumbs.db | ||
|
|
||
| # Hydra outputs | ||
| outputs/ | ||
|
|
||
| # Local artifacts | ||
| tinker.db | ||
| uv.lock | ||
|
|
||
| # Alembic - don't track pycache | ||
| tx/tinker/alembic/__pycache__/ | ||
|
|
||
| # SQLite databases (tracked in git by default, but ignore if created locally) | ||
| *.db | ||
| *.db-journal | ||
| *.db-wal | ||
| *.db-shm |
There was a problem hiding this comment.
This block adds many standard Python .gitignore entries, which is great. However, there are a few duplicated entries that should be removed to keep the file clean:
uv.lockis defined on line 49, but also on lines 79 and 131.*.dbis defined on line 46, but also on line 137.
Please remove the redundant entries on lines 79, 131, and 137.
| megatron = [ | ||
| "skyrl[skyrl-train]; python_version == '3.12'", | ||
| "transformer-engine[pytorch]==2.10.0; sys_platform == 'linux' and python_version == '3.12'", | ||
| "flash-attn==2.8.1; sys_platform == 'linux' and python_version == '3.12'", | ||
| "vllm==0.13.0; sys_platform == 'linux' and python_version == '3.12'", | ||
| "torch==2.9.0; sys_platform == 'linux' and python_version == '3.12'", | ||
| "flashinfer-python==0.5.3; sys_platform == 'linux' and platform_machine == 'x86_64' and python_version == '3.12'", | ||
| "torchvision; sys_platform == 'linux' and python_version == '3.12'", | ||
| "megatron-bridge; sys_platform == 'linux' and python_version == '3.12'", | ||
| "megatron-core==0.15.0; sys_platform == 'linux' and python_version == '3.12'", | ||
| "flashinfer-jit-cache==0.5.3; sys_platform == 'linux' and platform_machine == 'x86_64' and python_version == '3.12'", | ||
| "nvidia-modelopt; sys_platform == 'linux' and python_version == '3.12'", | ||
| ] |
There was a problem hiding this comment.
Pinning the entire megatron extra to python_version == '3.12' is quite restrictive. It prevents users on other Python versions (like 3.11, which is supported by the project) from installing and using this functionality. The comment mentions this is due to ml-dtypes. Have you considered addressing the dependency conflict more granularly, for example by using an override for ml-dtypes in [tool.uv.override-dependencies]? This would make the project more flexible for users on different Python versions.
| [project.scripts] | ||
| # The following is for supporting the skyrl-train dependency |
There was a problem hiding this comment.
The [project.scripts] section is empty, and the accompanying comment "# The following is for supporting the skyrl-train dependency" is a bit unclear. If there are no scripts to be installed, it would be cleaner to remove this section entirely. If it's a placeholder for future use, a more descriptive comment explaining its purpose would be helpful for future maintainers.
Uh oh!
There was an error while loading. Please reload this page.