Context
I've been tracing through the LiteRT build system to understand how GPU acceleration works. I noticed that the CPU (XNNPACK) and NPU (dispatch) accelerator source code is available in the repository, but the GPU accelerator appears to only be distributed as prebuilt .so files.
What I found
CPU and NPU accelerators have full source:
```
litert/runtime/accelerators/
├── xnnpack/          ← Source available ✅
├── dispatch/         ← Source available (NPU) ✅
└── (no gpu/ folder)  ← Not present ❌
```
GPU is only available as prebuilts:
```python
# litert/build_common/special_rule.bzl
def litert_gpu_accelerator_deps():
    return []  # Empty — no source-level deps

def litert_gpu_accelerator_prebuilts():
    return select({
        "android_arm64": ["@litert_prebuilts//:android_arm64/libLiteRtClGlAccelerator.so"],
        "macos_arm64": ["@litert_prebuilts//:macos_arm64/libLiteRtMetalAccelerator.dylib"],
        # ...
    })
```
Evidence that internal source exists but isn't published:
- `auto_registration.cc` (lines 40-41) defines a function pointer specifically for static GPU linking, currently set to `nullptr`:
  ```cpp
  extern "C" LiteRtStatus (*LiteRtRegisterStaticLinkedAcceleratorGpu)(
      LiteRtEnvironmentT& environment) = nullptr;
  ```
- `litert/tools/BUILD` references internal GPU accelerator targets behind `copybara:uncomment_begin(google-only)`:
  ```python
  # "//litert/runtime/accelerators/gpu:ml_drift_cl_gl_accelerator",
  # "//litert/runtime/accelerators/gpu:ml_drift_vulkan_accelerator",
  ```
- The `gpu_numerics_check_cl_gl`, `gpu_numerics_check_vulkan`, and `gpu_numerics_check_jet` build targets all reference these internal-only GPU accelerator libraries.
Problem with using prebuilts
The prebuilt `libLiteRtGpuAccelerator.so` dynamically links against `libLiteRt.so`. So even when using `run_model` (which already has the entire LiteRT runtime statically linked via `litert/cc:litert_compiled_model` → `litert/runtime:compiled_model`), I still have to push `libLiteRt.so` alongside the GPU `.so` for GPU to work. CPU works without any `.so` files.
```shell
# GPU requires both .so files even though run_model has LiteRT statically linked:
adb push run_model /data/local/tmp/
adb push libLiteRtGpuAccelerator.so /data/local/tmp/
adb push libLiteRt.so /data/local/tmp/  # Required for GPU to initialize
```
My use case
I'm modifying GPU delegate kernels under `tflite/delegates/gpu/cl/kernels/` and need to test them through LiteRT's accelerator path (to use `TensorBuffer` zero-copy). The GPU kernel code itself is already open source — what's missing is only the thin LiteRT accelerator wrapper (the equivalent of what `xnnpack_accelerator.cc` does for CPU) that calls `TfLiteGpuDelegateV2Create()` and registers with LiteRT's environment.
Without the accelerator wrapper source, the only options are:
- Use the prebuilt `.so` (cannot pick up kernel modifications)
- Use TFLite directly like `culprit_finder` does (loses LiteRT `TensorBuffer` / zero-copy benefits)
- Write a custom accelerator wrapper from scratch by reverse-engineering the XNNPACK pattern
Questions
- Is there a plan to open-source the GPU accelerator wrapper code (`litert/runtime/accelerators/gpu/`)?
- If not, is the recommended approach to write a custom wrapper following the XNNPACK accelerator pattern?