fix: release accelerator model refs during cleanup #1321

Merged
kcz358 merged 1 commit into EvolvingLMMs-Lab:main from xk-huang:fix/accelerator-cleanup-releases-models
May 6, 2026

xk-huang (Contributor) commented May 2, 2026

Summary

  • Release Accelerate-held model references during lmms.clean() by calling Accelerator.free_memory() when an accelerator is attached.
  • Fix a post-inference GPU memory retention path in which prepared models stay referenced by Accelerate even after the direct nn.Module attributes are deleted.
  • Add regression coverage for accelerator-retained model references.
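
The retention path described above can be illustrated with a minimal sketch. It uses only the standard library: `FakeModel`, `FakeAccelerator`, `Wrapper`, and `model_survives_clean` are hypothetical stand-ins, not code from this PR, and the `_models` list only approximates how Accelerate keeps prepared models alive. The point is that deleting the wrapper's own attribute does not free the model while another object still holds a reference to it.

```python
import gc
import weakref


class FakeModel:
    """Hypothetical stand-in for a prepared nn.Module (no torch required)."""


class FakeAccelerator:
    """Hypothetical stand-in that keeps prepared models in an internal list."""

    def __init__(self):
        self._models = []

    def prepare(self, model):
        # Keep a reference to the prepared model, as an accelerator would.
        self._models.append(model)
        return model

    def free_memory(self):
        # Drop the internal references, then run the garbage collector.
        self._models = []
        gc.collect()


class Wrapper:
    """Hypothetical model wrapper with a shared cleanup path."""

    def __init__(self):
        self.accelerator = FakeAccelerator()
        self.model = self.accelerator.prepare(FakeModel())

    def clean(self, release_accelerator=True):
        accelerator = getattr(self, "accelerator", None)
        # Release accelerator-held references first (assumed ordering).
        if release_accelerator and accelerator is not None:
            accelerator.free_memory()
        # Then drop the direct attribute and collect.
        if hasattr(self, "model"):
            del self.model
        gc.collect()


def model_survives_clean(release_accelerator):
    """Return True if the model object is still alive after clean()."""
    wrapper = Wrapper()
    ref = weakref.ref(wrapper.model)
    wrapper.clean(release_accelerator=release_accelerator)
    return ref() is not None
```

Calling `model_survives_clean(False)` shows the model outliving the attribute deletion, while `model_survives_clean(True)` shows it being released once the accelerator's references are cleared as well.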

In scope

  • Update lmms_eval/api/model.py so base model cleanup clears Accelerate internal model references before existing module deletion, garbage collection, and CUDA cache clearing.
  • Add test/models/test_model_cleanup.py to verify cleanup calls free_memory() and removes direct model attributes.
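
A regression test along these lines might look like the sketch below. This is not the content of test/models/test_model_cleanup.py; `DummyModel` and both test functions are hypothetical, and the accelerator is replaced by a `MagicMock` so that neither Accelerate nor a GPU is needed. It checks the two behaviors named above: `free_memory()` is called, and the direct model attribute is removed.

```python
import gc
from unittest.mock import MagicMock


class DummyModel:
    """Hypothetical stand-in for a model class with a shared clean() path."""

    def __init__(self, accelerator=None):
        self.accelerator = accelerator
        self.model = object()  # placeholder for an nn.Module

    def clean(self):
        # Release accelerator-held references when an accelerator is attached.
        if self.accelerator is not None:
            self.accelerator.free_memory()
        # Remove the direct model attribute, then collect garbage.
        if hasattr(self, "model"):
            del self.model
        gc.collect()


def test_clean_releases_accelerator_refs():
    accelerator = MagicMock()
    dummy = DummyModel(accelerator)
    dummy.clean()
    accelerator.free_memory.assert_called_once_with()
    assert not hasattr(dummy, "model")


def test_clean_without_accelerator():
    # Cleanup should still work when no accelerator is attached.
    dummy = DummyModel(accelerator=None)
    dummy.clean()
    assert not hasattr(dummy, "model")
```

Mocking the accelerator keeps the test fast and device-independent, at the cost of not verifying that GPU memory is actually returned.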

Out of scope

  • Does not change model generation, postprocessing, distributed gather behavior, sample logging, or task metrics.
  • Does not add model-specific cleanup logic for llava_onevision1_5; the fix applies through the shared base cleanup path.

Validation

  • python -m compileall -q lmms_eval/api/model.py test/models/test_model_cleanup.py | sample size: N=2 files | key metrics: syntax/import bytecode compilation | result: pass
  • python -m pytest -q test/models/test_model_cleanup.py | sample size: N=1 test | key metrics: accelerator model references released during clean() | result: pass (1 passed in 3.47s)

Risk / Compatibility

  • Low risk: this only runs during explicit model cleanup after inference; normal model execution is unchanged.
  • For Accelerate-backed models, free_memory() clears Accelerate internal references before the existing direct module cleanup and CUDA cache clearing, reducing rank memory pressure before postprocessing/gathering.

Type of Change

  • Bug fix (non-breaking change)
  • New feature
  • New benchmark/task
  • New model integration
  • Breaking change
  • Documentation update
  • Refactoring (no functional changes)

xk-huang changed the title from "Release accelerator model references during cleanup" to "fix: release accelerator model refs during cleanup" on May 2, 2026
kcz358 merged commit 3592b72 into EvolvingLMMs-Lab:main on May 6, 2026
2 of 3 checks passed