Flux2TEModel_ GGUF appears to be offloaded to RAM instead of fully released, causing OOM on low-RAM systems

### Custom Node Testing

- [ ] I have tried disabling custom nodes and the issue persists (see [how to disable custom nodes](https://docs.comfy.org/troubleshooting/custom-node-issues#step-1%3A-test-with-all-custom-nodes-disabled) if you need help)

### Your question

Environment:

- ComfyUI v0.24
- Google Colab
- Tesla T4 (15GB VRAM)
- ~13GB system RAM
- Flux2 Klein 9B GGUF Q8
- Flux2TEModel_ GGUF text encoder (Qwen 3 8B GGUF Q8)

Observed behavior:

After text encoding, cleanup/unload nodes are executed.

ComfyUI reports:

- Flux2TEModel_ loaded (~10GB)
- Partial unload from VRAM
- Flux UNet then loads (~9.7GB)

In many runs, the text encoder seems to be moved from VRAM into system RAM rather than being fully released.

RAM usage often reaches 90-96%.

Two outcomes are observed:

1. The RAM pressure cache eventually frees memory, RAM usage drops back to ~30-40%, and the workflow completes successfully.
2. RAM usage never drops and the process runs out of memory (OOM) while loading the UNet.
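
To tell the two outcomes apart, system RAM usage could be logged right after text encoding and again just before the UNet loads. The following is a minimal sketch, assuming a Linux host that exposes `/proc/meminfo` (as Colab does). The helper names `parse_meminfo`, `ram_used_percent`, and `log_ram` are hypothetical and not part of ComfyUI.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def ram_used_percent(text):
    """Return used RAM as a percentage, based on MemTotal and MemAvailable."""
    info = parse_meminfo(text)
    total = info["MemTotal"]
    available = info["MemAvailable"]
    return 100.0 * (total - available) / total


def log_ram(label, path="/proc/meminfo"):
    """Print and return the current RAM usage (assumes Linux /proc/meminfo)."""
    with open(path) as f:
        pct = ram_used_percent(f.read())
    print(f"[{label}] RAM used: {pct:.1f}%")
    return pct
```

Calling `log_ram("after text encode")` and `log_ram("before UNet load")` from a notebook cell or a small debugging node would show whether the ~10GB encoder is still resident in RAM at the point where the UNet starts loading.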

Additional notes:

- Reproduced with multiple different unload/cleanup nodes.
- ComfyUI logs indicate that the RAM pressure cache is active.
- The issue seems related to model retention/offloading rather than pure VRAM exhaustion.

Question:

Is there a supported mechanism to force complete disposal of a GGUF text encoder instead of CPU offloading?

Are there known cases where RAM pressure cache retains GGUF models or Flux2TEModel_ references longer than expected?

Are there any recommended debugging steps to determine which object/reference prevents memory reclamation?
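
For context, the kind of check I have in mind is the generic Python one sketched below: take a weak reference to the object, drop the known strong reference, force a garbage collection, and list what still refers to it. This uses only the standard `gc` and `weakref` modules. The `holder`/`key` arguments and the function name `release_and_check` are hypothetical stand-ins for wherever the text encoder is actually stored, and nothing here is ComfyUI API.

```python
import gc
import types
import weakref


def release_and_check(holder, key):
    """Delete holder[key], run the GC, and report whether the object was freed.

    Returns (freed, referrer_type_names). `holder` is any dict that owns the
    object; it is a hypothetical stand-in for the real owner.
    """
    ref = weakref.ref(holder[key])  # does not keep the object alive
    del holder[key]                 # drop the known strong reference
    gc.collect()                    # also clears reference cycles
    obj = ref()
    if obj is None:
        return True, []
    # Something else still holds the object; list the types of its referrers,
    # skipping stack frames, which only reflect this function's own locals.
    names = [
        type(r).__name__
        for r in gc.get_referrers(obj)
        if not isinstance(r, types.FrameType)
    ]
    return False, names
```

If the result is `(False, [...])`, the listed referrer types (for example a `list` or `dict` belonging to a cache) would point at what is keeping the encoder alive after the unload nodes have run.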

### Logs

```powershell

```

### Other

_No response_

Flux2TEModel_ GGUF appears to be offloaded to RAM instead of fully released, causing OOM on low-RAM systems #14433
