Open
Labels
type:bug, type:gpu delegate, type:macOS, type:memory
Description
Environment
Computer: MacBook Pro M1 Pro
macOS version: 26.2 (Tahoe)
Python version: 3.11
LiteRT version: 2.1.2
Issue
Export Torchvision LRASPP MobileNet V3 Large model to TFLite:
import litert_torch
import torch
from torchvision.models.segmentation import lraspp_mobilenet_v3_large
model = lraspp_mobilenet_v3_large().eval()
sample_inputs = (torch.randn(1, 3, 520, 520),)
edge_model = litert_torch.convert(model, sample_inputs)
edge_model.export("lraspp_mobilenet_v3_large.tflite")

Then perform inference using GPU acceleration:
import numpy as np
from ai_edge_litert.compiled_model import CompiledModel, HardwareAccelerator
from tqdm import tqdm
model = CompiledModel.from_file(
    "lraspp_mobilenet_v3_large.tflite",
    HardwareAccelerator.GPU | HardwareAccelerator.CPU,
)
input_data = np.random.rand(1, 3, 520, 520).astype(np.float32)
input_buffers = model.create_input_buffers(0)
output_buffers = model.create_output_buffers(0)
for _ in tqdm(range(10_000)):
    input_buffers[0].write(input_data)
    model.run_by_index(0, input_buffers, output_buffers)
    output_buffers[0].read(21 * 520 * 520, np.float32)

This results in RAM usage rising steadily to very high values.
I wasn't able to reproduce the issue under the same conditions on an x86_64 Ubuntu 24 machine with an Nvidia RTX 3070. The issue does not seem to arise when using CPU acceleration only (or memory grows so slowly that it is not noticeable). The issue also does not arise when the output buffer read is removed.
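To put numbers on the memory growth described above, the process's peak resident set size can be sampled while the loop runs. The following is a minimal sketch that uses only the Python standard library; `peak_rss_mib` and `track_peak_rss` are hypothetical helper names that are not part of LiteRT, and the approach assumes a Unix-like system where the `resource` module is available.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def track_peak_rss(step, iterations, sample_every=100):
    """Call step() repeatedly and record (iteration, peak RSS in MiB) samples.

    `step` is any zero-argument callable, for example a function wrapping
    one write / run / read cycle of the reproduction loop (an assumption
    about how the helper would be used, not something it enforces).
    """
    samples = [(0, peak_rss_mib())]
    for i in range(1, iterations + 1):
        step()
        if i % sample_every == 0:
            samples.append((i, peak_rss_mib()))
    return samples
```

In the reproduction, `step` would wrap the three statements of the loop body. Because `ru_maxrss` is a peak value, the samples never decrease; a series that keeps climbing with GPU acceleration but flattens out with CPU only, or without the output buffer read, would be consistent with the behaviour reported here.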
Logs
INFO: [environment.cc:29] Creating LiteRT environment with options
WARNING: [auto_registration.cc:71] NPU accelerator could not be loaded and registered: kLiteRtStatusErrorInvalidArgument.
INFO: [auto_registration.cc:150] Loading GPU accelerator(/Users/louislac/Downloads/google-ai-edge/.venv/lib/python3.11/site-packages/ai_edge_litert/libLiteRtGpuAccelerator.dylib).
INFO: [auto_registration.cc:150] Loading GPU accelerator(/Users/louislac/Downloads/google-ai-edge/.venv/lib/python3.11/site-packages/ai_edge_litert/libLiteRtWebGpuAccelerator.dylib).
INFO: [auto_registration.cc:150] Loading GPU accelerator(/Users/louislac/Downloads/google-ai-edge/.venv/lib/python3.11/site-packages/ai_edge_litert/libLiteRtMetalAccelerator.dylib).
INFO: [accelerator_registry.cc:52] RegisterAccelerator: ptr=0x7988005a0, name=GPU Metal
INFO: [auto_registration.cc:158] Dynamically loaded GPU accelerator(libLiteRtMetalAccelerator.dylib) registered.
INFO: [accelerator_registry.cc:52] RegisterAccelerator: ptr=0x798819a40, name=CpuAccelerator
INFO: [auto_registration.cc:183] CPU accelerator registered.
INFO: [compiled_model.cc:485] Flatbuffer model initialized directly from incoming litert model.
I0000 00:00:1770661993.173968 11386794 delegate_metal.mm:83] Created a Metal device.
INFO: [environment.cc:40] Adding options to the existing LiteRT environment
INFO: [gpu_environment.cc:370] Failed to create OpenCL context.
INFO: [gpu_environment.cc:377] Created Metal device from provided device id
INFO: [gpu_environment.h:150] Created LiteRT GpuEnvironment.
I0000 00:00:1770661993.182718 11386794 delegate_kernel.cc:668] Initializing Metal-based API from graph.