
Running a tflite model on the NPU backend of a Qualcomm device fails. #6059

@jixiedaima

Description

  1. Build commands:
    bazel clean --expunge
    bazel build -c opt --config=android_arm64 --define=xnn_enable_arm_sme=true --define=xnn_enable_arm_sme2=true //litert/c:libLiteRt.so
    bazel build -c opt --config=android_arm64 --define=xnn_enable_arm_sme=true --define=xnn_enable_arm_sme2=true //litert/vendors/qualcomm/dispatch:libLiteRtDispatch_Qualcomm.so

  2. Test command and output:
    dada:/data/local/tmp $ ./benchmark_model --graph=speechbrain_sepformer_wham_decoder.tflite --use_npu=true --dispatch_library_path=. --compiler_plugin_library_path=.
    INFO: STARTING!
    INFO: Log parameter values verbosely: [0]
    INFO: [benchmark_litert_model.cc:201] Loading model from: speechbrain_sepformer_wham_decoder.tflite
    INFO: [benchmark_litert_model.cc:171] dispatch_library_path: .
    INFO: [benchmark_litert_model.cc:175] compiler_plugin_library_path: .
    INFO: [benchmark_litert_model.cc:178] compiler_cache_path:
    INFO: [environment.cc:30] Creating LiteRT environment with options
    INFO: [accelerator_registry.cc:52] RegisterAccelerator: ptr=0xb400007e10c54120, name=NpuAccelerator
    INFO: [auto_registration.cc:160] NPU accelerator registered.
    INFO: [auto_registration.cc:243] Loading GPU accelerator(libLiteRtGpuAccelerator.so).
    INFO: [auto_registration.cc:243] Loading GPU accelerator(libLiteRtClGlAccelerator.so).
    INFO: [auto_registration.cc:243] Loading GPU accelerator(libLiteRtOpenClAccelerator.so).
    INFO: [auto_registration.cc:243] Loading GPU accelerator(libLiteRtWebGpuAccelerator.so).
    INFO: [auto_registration.cc:243] Loading GPU accelerator(libLiteRtVulkanAccelerator.so).
    WARNING: [auto_registration.cc:272] GPU accelerator could not be loaded and registered.
    INFO: [accelerator_registry.cc:52] RegisterAccelerator: ptr=0xb400007e10c534e0, name=CpuAccelerator
    INFO: [auto_registration.cc:283] CPU accelerator registered.
    INFO: [compiled_model.cc:562] NPU JIT compilation caching enabled with cache dir:
    INFO: [compiled_model.cc:577] Applying compiler plugins...
    WARNING: [compiled_model.cc:585] Failed to apply compiler plugins: No compiler plugin found
    INFO: [compiled_model.cc:650] Flatbuffer model initialized directly from incoming litert model.
    INFO: Initialized TensorFlow Lite runtime.
    INFO: [litert_dispatch.cc:131] Loading shared library: ./libLiteRtDispatch_Qualcomm.so
    INFO: [common.h:131]
    ::qnn::Options:
    LogLevel: 0
    BackendType: 2
    Profiling: 0
    UseHtpPreference: false
    UseQint16AsQuint16: false
    EnableWeightSharing: false
    UseConvHMX: true
    UseFoldReLU: false
    HtpPerformanceMode: 2
    DumpTensorIds:
    IrJsonDir:
    DlcDir:
    VtcmSize: 0
    HvxThread: 0
    OptimizationLevel: 0
    GraphPriority: 0
    SaverOutputDir:

    INFO: [qnn_manager.cc:372] Adding shared library dir to path: .
    INFO: [dynamic_loading.cc:143] Adding /data/local/tmp:/data/local/tmp:/data/local/tmp::. to LD_LIBRARY_PATH
    INFO: [qnn_manager.cc:121] Loading qnn shared library from "libQnnHtp.so"
    INFO: [qnn_manager.cc:130] Loaded qnn shared library
    ERROR: [dispatch_delegate.cc:114] Failed to initialize Dispatch API: ERROR: [litert/runtime/dispatch/dispatch_delegate.cc:179]

  3. Files in /data/local/tmp:
    FastVLM-0.5B.qualcomm.sm8750.litertlm libQnnCpu.so libQnnHta.so libQnnHtpV73Stub.so libQnnLpaiProfilingReader.so libSnpeHtpV68CalculatorStub.so libcalculator.so
    Gemma3-1B-IT_q4_ekv1280_sm8750.litertlm libQnnCpuNetRunExtensions.so libQnnHtaNetRunExtensions.so libQnnHtpV75CalculatorStub.so libQnnLpaiStub.so libSnpeHtpV68Stub.so libhta_hexagon_runtime_qnn.so
    aux.tflite libQnnDsp.so libQnnHtp.so libQnnHtpV75Stub.so libQnnModelDlc.so libSnpeHtpV69CalculatorStub.so libhta_hexagon_runtime_snpe.so
    benchmark_model libQnnDspNetRunExtensions.so libQnnHtpNetRunExtensions.so libQnnHtpV79CalculatorStub.so libQnnNetRunDirectV79Stub.so libSnpeHtpV69Stub.so litert_lm_main
    embedder.tflite libQnnDspV66CalculatorStub.so libQnnHtpOptraceProfilingReader.so libQnnHtpV79Skel.so libQnnNetRunDirectV81Stub.so libSnpeHtpV73CalculatorStub.so litertlm_export_main
    external_weight_loader_test libQnnDspV66Stub.so libQnnHtpPrepare.so libQnnHtpV79Stub.so libQnnSaver.so libSnpeHtpV73Stub.so prefill_decode.tflite
    libGemmaModelConstraintProvider.so libQnnGenAiTransformer.so libQnnHtpProfilingReader.so libQnnHtpV81CalculatorStub.so libQnnSystem.so libSnpeHtpV75CalculatorStub.so run_model
    libGenie.so libQnnGenAiTransformerCpuOpPkg.so libQnnHtpV68CalculatorStub.so libQnnHtpV81Stub.so libQnnTFLiteDelegate.so libSnpeHtpV75Stub.so scoped.bin
    libLiteRt.so libQnnGenAiTransformerModel.so libQnnHtpV68Stub.so libQnnIr.so libSNPE.so libSnpeHtpV79CalculatorStub.so speechbrain_sepformer_wham_decoder.tflite
    libLiteRtDispatch_Qualcomm.so libQnnGpu.so libQnnHtpV69CalculatorStub.so libQnnJsonProfilingReader.so libSnpeDspV66Stub.so libSnpeHtpV79Stub.so speechbrain_sepformer_wham_decoder_in_0.bin
    libPlatformValidatorShared.so libQnnGpuNetRunExtensions.so libQnnHtpV69Stub.so libQnnLpai.so libSnpeHta.so libSnpeHtpV81CalculatorStub.so speechbrain_sepformer_wham_decoder_out_0.bin
    libQnnChrometraceProfilingReader.so libQnnGpuProfilingReader.so libQnnHtpV73CalculatorStub.so libQnnLpaiNetRunExtensions.so libSnpeHtpPrepare.so libSnpeHtpV81Stub.so weights.bin

However, I can run Gemma3-1B-IT_q4_ekv1280_sm8750.litertlm on the same 8 Gen 4 device. Could you help me?
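For completeness, this is how I set up the device-side environment before running benchmark_model. The directory path and the ADSP_LIBRARY_PATH entries are my assumptions based on the usual QNN HTP convention; adjust for your device:

```shell
# Environment setup on the device, run in the same shell as benchmark_model.
# DEVICE_DIR and the ADSP_LIBRARY_PATH entries are assumptions following
# the common QNN HTP layout; they are not confirmed by the log above.
DEVICE_DIR=/data/local/tmp
export LD_LIBRARY_PATH="$DEVICE_DIR:$LD_LIBRARY_PATH"
# The HTP skel libraries (libQnnHtpV*Skel.so) are loaded on the DSP side,
# which searches ADSP_LIBRARY_PATH rather than LD_LIBRARY_PATH.
export ADSP_LIBRARY_PATH="$DEVICE_DIR;/vendor/dsp/cdsp;/system/vendor/lib/rfsa/adsp;/dsp"
# ./benchmark_model --graph=speechbrain_sepformer_wham_decoder.tflite \
#   --use_npu=true --dispatch_library_path=. --compiler_plugin_library_path=.
```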
