build: move flash-linear-attention back to optional-dependencies#1894

Open
zpqiu wants to merge 4 commits into main from zpqiu/fla-back-to-optional-dep

Conversation

@zpqiu
Contributor

@zpqiu zpqiu commented Apr 17, 2026

What does this PR do?

Move flash-linear-attention from Automodel's [dependency-groups].dev (PEP 735) back into a dedicated fla extra under [project.optional-dependencies], so that downstream consumers (e.g. NeMo-RL) can resolve it transitively via nemo-automodel[moe] / [all]. PEP 735 dependency-groups cannot be referenced via the pkg[extra] syntax, so #1580 silently broke fla for those consumers.

Changelog

  • Add new fla [project.optional-dependencies] extra pinning flash-linear-attention>=0.4.2 (dropping the VCS SHA pin now that the official 0.4.2 release includes fla.ops.cp).
  • Compose nemo_automodel[fla] into the moe and all extras so downstream consumers pick fla up transitively.
  • Remove flash-linear-attention (and the stale "pin to main until a release includes it" comment) from [dependency-groups].dev.
  • Refresh uv.lock to reflect the new resolution.
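Taken together, the changes above give the pyproject.toml roughly the following shape — a hedged sketch, not the exact diff; the extra names and the self-referential `nemo_automodel[fla]` composition are from this PR's description, while the surrounding entries are illustrative:

```toml
[project.optional-dependencies]
# The official 0.4.2 release includes fla.ops.cp, so the VCS SHA pin is dropped.
fla = ["flash-linear-attention>=0.4.2"]
# Self-referential extras let nemo-automodel[moe] / [all] resolve fla transitively,
# which plain PEP 735 dependency-groups cannot do via the pkg[extra] syntax.
moe = [
    "nemo_automodel[fla]",
    # ...other MoE dependencies (illustrative)...
]
all = [
    "nemo_automodel[moe]",
]
```

With this layout, a downstream consumer installing `nemo-automodel[moe]` picks up flash-linear-attention automatically, whereas a dev-only dependency group is visible only to this repo's own `uv sync`.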

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests? — N/A, packaging-only change.
  • Did you add or update any necessary documentation? — N/A, no user-facing behavior change; the in-file comment near the new `fla` extra has been refreshed to reflect the 0.4.2 version gate.

Move flash-linear-attention from the [dependency-groups].dev group
(PEP 735, resolvable only by this repo's own `uv sync`) into a dedicated
`fla` [project.optional-dependencies] extra, and compose it into the
`moe` and `all` extras. Downstream consumers (e.g. NeMo-RL) pull
nemo-automodel via the `pkg[extra]` syntax, which cannot reference
PEP 735 dependency-groups, so after #1580 `fla` was silently missing
for them.

Also drop the VCS SHA pin in favor of the official 0.4.2 release, now
that `fla.ops.cp` (required for CP on linear attention) is included
upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
@zpqiu zpqiu requested a review from a team as a code owner April 17, 2026 14:53
@copy-pr-bot

copy-pr-bot bot commented Apr 17, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
@zpqiu zpqiu requested review from akoumpa and thomasdhc April 17, 2026 14:57
@zpqiu
Contributor Author

zpqiu commented Apr 17, 2026

/ok to test c27163b

@zpqiu
Contributor Author

zpqiu commented Apr 17, 2026

CI failures on L0_Unit_Tests_CPU / L0_Unit_Tests_GPU are pre-existing on main. This PR only touches pyproject.toml and is unrelated to the broken tests.

@akoumpa
Contributor

akoumpa commented Apr 17, 2026

Hi @zpqiu, I'm preparing a fix in #1898. I'll merge & trigger CI once it's done. I apologize for the trouble.

@akoumpa
Contributor

akoumpa commented Apr 17, 2026

/ok to test 046aa02

@akoumpa
Contributor

akoumpa commented Apr 18, 2026

@NVIDIA-NeMo/automation PTAL, thank you.
