Description
I haven't quite figured out how yet, but there appears to be a memory leak somewhere in this library, or at least in the ONNX runtime. I've tested multiple models so far, and the result is always the same: the program eventually accumulates enough memory to OOM, despite my efforts to batch inferences and avoid holding more data in memory than necessary.
My pipeline is as follows:
```rust
fn create_model(tokenizer_path: &PathBuf, model_path: &PathBuf) -> GlinerResult<GLiNER<SpanMode>> {
    let model = GLiNER::<SpanMode>::new(
        Parameters::default(),
        RuntimeParameters::default().with_threads(NUM_THREADS),
        tokenizer_path,
        model_path,
    )?;
    Ok(model)
}
```

Note that I have not tested this in tokenizer mode.
I'm simply reading a CSV in batches (not using the built-in CSV TextInput functionality) and creating a TextInput from each batch:
```rust
let input = TextInput::new(text_batch.clone(), vec!["us_organizations".to_string()])?;
```

I should note that my inputs are relatively large at first, and I process them in batches of 1,000. However, I split inputs by token count so that no single input among the 1,000 exceeds 512 tokens when sent to the model. I have 3,000,000 records in total. Perhaps the accumulation isn't noticeable on smaller datasets, but since this pipeline runs fairly long, it is very clear here: each batch inferenced accumulates more and more memory.
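For reference, the splitting step looks roughly like this. This is only a sketch: the function name is hypothetical, and whitespace splitting stands in for the model's real subword tokenizer, which is what actually determines the 512-token limit.

```rust
/// Split one record's text into chunks of at most `max_tokens` tokens.
/// Whitespace tokenization is a stand-in for the real subword tokenizer.
fn split_by_token_count(text: &str, max_tokens: usize) -> Vec<String> {
    let tokens: Vec<&str> = text.split_whitespace().collect();
    tokens
        .chunks(max_tokens)          // at most `max_tokens` tokens per chunk
        .map(|chunk| chunk.join(" "))
        .collect()
}

fn main() {
    let chunks = split_by_token_count("a b c d e", 2);
    println!("{:?}", chunks);
}
```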
Here is a profile of the memory during an execution (note that this run processes only one batch, i.e. 1,000 records):
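To quantify the accumulation per batch numerically (rather than from a profiler screenshot), one option is to log the process's resident set size after each batch. This is a Linux-only debugging aid, not part of the library:

```rust
use std::fs;

/// Read the process's resident set size (VmRSS, in kB) from
/// /proc/self/status. Returns None if the field can't be read/parsed
/// (e.g. on non-Linux platforms).
fn rss_kb() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    status
        .lines()
        .find(|line| line.starts_with("VmRSS:"))
        .and_then(|line| line.split_whitespace().nth(1))
        .and_then(|value| value.parse().ok())
}

fn main() {
    // In the pipeline, this would be called after each inference batch,
    // e.g.: println!("batch {}: rss = {:?} kB", batch_idx, rss_kb());
    println!("current RSS: {:?} kB", rss_kb());
}
```

If RSS grows monotonically with each batch while the batch size stays constant, that supports the leak hypothesis rather than normal allocator behavior.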
Am I perhaps using the library wrong, or is this a bug?