Optimize causal-conv1d/transformer engine installation in CI #1796

@akoumpa

Description

Is your feature request related to a problem? Please describe.
In the python + cuda job (https://github.com/NVIDIA-NeMo/Automodel/actions/runs/24319912918/job/71004049370?pr=1769), installing automodel[cuda] takes about 20 minutes, which slows down developers. It would be a good idea to keep a cache of precompiled builds of these packages.

Mon, 13 Apr 2026 00:21:45 GMT  Building wheels for collected packages: nemo-automodel, transformer_engine_torch, causal-conv1d, mamba-ssm, nv-grouped-gemm
Mon, 13 Apr 2026 00:21:45 GMT  Building wheel for nemo-automodel (pyproject.toml): started
Mon, 13 Apr 2026 00:21:46 GMT  Building wheel for nemo-automodel (pyproject.toml): finished with status 'done'
Mon, 13 Apr 2026 00:21:46 GMT  Created wheel for nemo-automodel: filename=nemo_automodel-0.4.0-py3-none-any.whl size=1088077 sha256=37ed4b3d7862d127d71cdbd8848f2412d8fad93dec54989a24e4631722c6f5fb
Mon, 13 Apr 2026 00:21:46 GMT  Stored in directory: /tmp/pip-ephem-wheel-cache-rknr4mfb/wheels/f7/73/44/d1794a81399023002424c5bb6c8f421dff2bde2442cb1e1b10
Mon, 13 Apr 2026 00:21:46 GMT  Building wheel for transformer_engine_torch (pyproject.toml): started
Mon, 13 Apr 2026 00:22:55 GMT  Building wheel for transformer_engine_torch (pyproject.toml): still running...
Mon, 13 Apr 2026 00:24:03 GMT  Building wheel for transformer_engine_torch (pyproject.toml): still running...
Mon, 13 Apr 2026 00:25:04 GMT  Building wheel for transformer_engine_torch (pyproject.toml): still running...
Mon, 13 Apr 2026 00:26:10 GMT  Building wheel for transformer_engine_torch (pyproject.toml): still running...
Mon, 13 Apr 2026 00:27:12 GMT  Building wheel for transformer_engine_torch (pyproject.toml): still running...
Mon, 13 Apr 2026 00:28:18 GMT  Building wheel for transformer_engine_torch (pyproject.toml): still running...
Mon, 13 Apr 2026 00:29:22 GMT  Building wheel for transformer_engine_torch (pyproject.toml): still running...
Mon, 13 Apr 2026 00:29:23 GMT  Building wheel for transformer_engine_torch (pyproject.toml): finished with status 'done'
Mon, 13 Apr 2026 00:29:23 GMT  Created wheel for transformer_engine_torch: filename=transformer_engine_torch-2.11.0-cp312-cp312-linux_x86_64.whl size=702275 sha256=26567b8eb6430774d6f82978a436ac926a695f6d66159d14e176824d502443ff
Mon, 13 Apr 2026 00:29:23 GMT  Stored in directory: /tmp/pip-ephem-wheel-cache-rknr4mfb/wheels/c6/2c/5c/6ec22e7d5f6268523b09224b2c8baaea63007249f00d2e3f3e
Mon, 13 Apr 2026 00:29:23 GMT  Building wheel for causal-conv1d (pyproject.toml): started
Mon, 13 Apr 2026 00:32:10 GMT  Building wheel for causal-conv1d (pyproject.toml): still running...
Mon, 13 Apr 2026 00:34:24 GMT  Building wheel for causal-conv1d (pyproject.toml): still running...
Mon, 13 Apr 2026 00:35:35 GMT  Building wheel for causal-conv1d (pyproject.toml): still running...
Mon, 13 Apr 2026 00:36:38 GMT  Building wheel for causal-conv1d (pyproject.toml): still running...
Mon, 13 Apr 2026 00:37:54 GMT  Building wheel for causal-conv1d (pyproject.toml): still running...
Mon, 13 Apr 2026 00:38:58 GMT  Building wheel for causal-conv1d (pyproject.toml): still running...
Mon, 13 Apr 2026 00:39:49 GMT  Building wheel for causal-conv1d (pyproject.toml): finished with status 'done'
Mon, 13 Apr 2026 00:39:49 GMT  Created wheel for causal-conv1d: filename=causal_conv1d-1.6.1-cp312-cp312-linux_x86_64.whl size=253650229 sha256=fff7600c79173705c36c1ea62a2ba9a1209b2ece7ddadd3de17ccba3e5ead5db
Mon, 13 Apr 2026 00:39:49 GMT  Stored in directory: /tmp/pip-ephem-wheel-cache-rknr4mfb/wheels/98/4a/75/b24971cff4599825b16b612f08fbd2e60a2c336a56e081a3c8
Mon, 13 Apr 2026 00:39:49 GMT  Building wheel for mamba-ssm (pyproject.toml): started
Mon, 13 Apr 2026 00:39:55 GMT  Building wheel for mamba-ssm (pyproject.toml): finished with status 'done'

So the job spends about 18 minutes installing TE (transformer_engine_torch, roughly 7.5 minutes) and causal-conv1d (roughly 10.5 minutes), and almost all of that time goes to compilation.
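The per-package build times can be read directly off the log timestamps. Below is a minimal sketch, assuming the log line format shown above; `LINE_RE` and `build_durations` are hypothetical names introduced for illustration, and the embedded `LOG` holds only the start/finish lines copied from the excerpt.

```python
import re
from datetime import datetime

# Start/finish lines copied from the CI log excerpt above.
LOG = """\
Mon, 13 Apr 2026 00:21:46 GMT  Building wheel for transformer_engine_torch (pyproject.toml): started
Mon, 13 Apr 2026 00:22:55 GMT  Building wheel for transformer_engine_torch (pyproject.toml): still running...
Mon, 13 Apr 2026 00:29:23 GMT  Building wheel for transformer_engine_torch (pyproject.toml): finished with status 'done'
Mon, 13 Apr 2026 00:29:23 GMT  Building wheel for causal-conv1d (pyproject.toml): started
Mon, 13 Apr 2026 00:39:49 GMT  Building wheel for causal-conv1d (pyproject.toml): finished with status 'done'
"""

# Matches only "started" and "finished" lines; "still running..." lines are skipped.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2}) GMT\s+"
    r"Building wheel for (?P<pkg>\S+) \(pyproject\.toml\): "
    r"(?P<state>started|finished)"
)


def build_durations(log_text):
    """Return {package: seconds between its 'started' and 'finished' lines}."""
    started = {}
    durations = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue
        stamp = datetime.strptime(match["ts"], "%a, %d %b %Y %H:%M:%S")
        pkg = match["pkg"]
        if match["state"] == "started":
            started[pkg] = stamp
        elif pkg in started:
            durations[pkg] = int((stamp - started[pkg]).total_seconds())
    return durations


if __name__ == "__main__":
    for pkg, seconds in build_durations(LOG).items():
        print(f"{pkg}: {seconds // 60}m{seconds % 60:02d}s")
```

On the lines above this gives 457 s (7m37s) for transformer_engine_torch and 626 s (10m26s) for causal-conv1d.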

Describe the solution you'd like
Use precompiled packages or a wheel cache so that CI no longer spends this time compiling transformer_engine_torch and causal-conv1d on every run.
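One possible shape for such a cache, as a rough sketch rather than a proposal for the actual workflow: build the two slow packages once with `pip wheel` into a directory that the CI cache restores, keyed on everything that affects the compiled binaries. The package versions below come from the wheel filenames in the log; the helper names (`wheel_cache_key`, `missing_wheels`, `ensure_wheels`) and the CUDA and torch version strings in the usage example are hypothetical, and the project's real build may need extra pip flags.

```python
import hashlib
import subprocess
import sys
from pathlib import Path

# Versions taken from the wheel filenames in the CI log above.
PINS = {"transformer_engine_torch": "2.11.0", "causal-conv1d": "1.6.1"}


def wheel_cache_key(pins, python_tag, cuda_version, torch_version):
    """Key that changes whenever an input affecting the compiled wheels changes."""
    parts = [f"{name}=={version}" for name, version in sorted(pins.items())]
    parts += [f"python={python_tag}", f"cuda={cuda_version}", f"torch={torch_version}"]
    digest = hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()[:16]
    return f"cuda-wheels-{python_tag}-{digest}"


def missing_wheels(pins, wheel_filenames):
    """Return the pinned packages that have no matching wheel file yet."""
    names = list(wheel_filenames)
    missing = []
    for name, version in pins.items():
        # Wheel filenames use underscores, e.g. causal_conv1d-1.6.1-cp312-...
        prefix = f"{name.replace('-', '_')}-{version}-"
        if not any(filename.startswith(prefix) for filename in names):
            missing.append(name)
    return missing


def ensure_wheels(pins, wheelhouse):
    """Build only the wheels that the restored cache directory lacks."""
    wheelhouse = Path(wheelhouse)
    wheelhouse.mkdir(parents=True, exist_ok=True)
    todo = missing_wheels(pins, (p.name for p in wheelhouse.glob("*.whl")))
    for name in todo:
        # Assumption: a plain `pip wheel` is enough; real builds may need more flags.
        subprocess.run(
            [sys.executable, "-m", "pip", "wheel", "--no-deps",
             "--wheel-dir", str(wheelhouse), f"{name}=={pins[name]}"],
            check=True,
        )
    return todo


if __name__ == "__main__":
    # Hypothetical CUDA/torch versions, used here only to show the key format.
    print(wheel_cache_key(PINS, "cp312", "12.8", "2.8.0"))
```

In a CI job, the cache step would restore the wheel directory under this key, `ensure_wheels` would compile only on a cache miss, and the later install would point pip at the directory with `--find-links` so the prebuilt wheels are picked up instead of the source distributions.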

Describe alternatives you've considered
N/A

Additional context
N/A
