Memory leak with macOS Metal GPU acceleration #5706

@laclouis5

Description

Environment

Computer: MacBook Pro M1 Pro
macOS version: 26.2 (Tahoe)
Python version: 3.11
LiteRT version: 2.1.2

Issue

Export the Torchvision LRASPP MobileNet V3 Large model to TFLite:

import litert_torch
import torch
from torchvision.models.segmentation import lraspp_mobilenet_v3_large

model = lraspp_mobilenet_v3_large().eval()
sample_inputs = (torch.randn(1, 3, 520, 520),)

edge_model = litert_torch.convert(model, sample_inputs)
edge_model.export("lraspp_mobilenet_v3_large.tflite")

Then perform inference using GPU acceleration:

import numpy as np
from ai_edge_litert.compiled_model import CompiledModel, HardwareAccelerator
from tqdm import tqdm

model = CompiledModel.from_file(
    "lraspp_mobilenet_v3_large.tflite",
    HardwareAccelerator.GPU | HardwareAccelerator.CPU,
)

input_data = np.random.rand(1, 3, 520, 520).astype(np.float32)
input_buffers = model.create_input_buffers(0)
output_buffers = model.create_output_buffers(0)

for _ in tqdm(range(10_000)):
    input_buffers[0].write(input_data)
    model.run_by_index(0, input_buffers, output_buffers)
    output_buffers[0].read(21 * 520 * 520, np.float32)
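For reference, the element count in the read call, 21 * 520 * 520, matches the model's segmentation logits: 21 classes over a 520x520 grid. Assuming the output buffer is laid out contiguously in NCHW order (an assumption about the buffer layout, not something confirmed by the LiteRT docs), the flat array read back can be reshaped into logits and a per-pixel class map:

```python
import numpy as np

# Hypothetical stand-in for the array returned by
# output_buffers[0].read(21 * 520 * 520, np.float32).
flat = np.zeros(21 * 520 * 520, dtype=np.float32)

# Assuming a contiguous NCHW layout: (batch, classes, height, width).
logits = flat.reshape(1, 21, 520, 520)

# Per-pixel predicted class via argmax over the class axis.
class_map = logits.argmax(axis=1)  # shape (1, 520, 520)
```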

This causes RAM usage to rise steadily to very high values.

I wasn't able to reproduce the issue on an x86_64 Ubuntu 24 machine with an Nvidia RTX 3070 under the same conditions. The issue does not seem to arise with CPU acceleration only (or the growth is so slow that it is not noticeable), and it does not arise when the output buffer read is removed.
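To quantify the growth rather than eyeballing a system monitor, one could sample the process's peak resident set size around batches of iterations using the standard-library resource module. This is a diagnostic sketch, not part of the original repro; note that ru_maxrss is reported in bytes on macOS but in kilobytes on Linux:

```python
import resource
import sys

def peak_rss_mib() -> float:
    """Return this process's peak resident set size in MiB."""
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is in bytes on macOS (darwin), kilobytes on Linux.
    scale = 1 if sys.platform == "darwin" else 1024
    return rss * scale / (1024 * 1024)

# Usage sketch: sample before and after a batch of inferences,
# e.g. every 1000 iterations of the loop above.
before = peak_rss_mib()
# ... run N iterations of the inference loop here ...
after = peak_rss_mib()
print(f"peak RSS grew by {after - before:.1f} MiB")
```

A steady, roughly linear increase per batch when the output read is present (and a flat curve when it is removed) would confirm the read path as the source of the leak.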

Logs

INFO: [environment.cc:29] Creating LiteRT environment with options
WARNING: [auto_registration.cc:71] NPU accelerator could not be loaded and registered: kLiteRtStatusErrorInvalidArgument.
INFO: [auto_registration.cc:150] Loading GPU accelerator(/Users/louislac/Downloads/google-ai-edge/.venv/lib/python3.11/site-packages/ai_edge_litert/libLiteRtGpuAccelerator.dylib).
INFO: [auto_registration.cc:150] Loading GPU accelerator(/Users/louislac/Downloads/google-ai-edge/.venv/lib/python3.11/site-packages/ai_edge_litert/libLiteRtWebGpuAccelerator.dylib).
INFO: [auto_registration.cc:150] Loading GPU accelerator(/Users/louislac/Downloads/google-ai-edge/.venv/lib/python3.11/site-packages/ai_edge_litert/libLiteRtMetalAccelerator.dylib).
INFO: [accelerator_registry.cc:52] RegisterAccelerator: ptr=0x7988005a0, name=GPU Metal
INFO: [auto_registration.cc:158] Dynamically loaded GPU accelerator(libLiteRtMetalAccelerator.dylib) registered.
INFO: [accelerator_registry.cc:52] RegisterAccelerator: ptr=0x798819a40, name=CpuAccelerator
INFO: [auto_registration.cc:183] CPU accelerator registered.
INFO: [compiled_model.cc:485] Flatbuffer model initialized directly from incoming litert model.
I0000 00:00:1770661993.173968 11386794 delegate_metal.mm:83] Created a Metal device.
INFO: [environment.cc:40] Adding options to the existing LiteRT environment
INFO: [gpu_environment.cc:370] Failed to create OpenCL context.
INFO: [gpu_environment.cc:377] Created Metal device from provided device id
INFO: [gpu_environment.h:150] Created LiteRT GpuEnvironment.
I0000 00:00:1770661993.182718 11386794 delegate_kernel.cc:668] Initializing Metal-based API from graph.

Labels

type:bug (Bug)
type:gpu delegate (Issue with GPU delegation)
type:macOS (For macOS related issues)
type:memory (An issue with memory, memory performance, or memory leaks)
