JNR issue on Android phones #6147

@HongHuangNeu

Description

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

None

OS Platform and Distribution

Android 12, 13

Mobile device if the issue happens on mobile device

Google Pixel 4a, Google Pixel 5, Samsung SM-A556E, Xiaomi Redmi Note 7, OPPO CPH2127

Browser and version if the issue happens on browser

No response

Programming Language and version

Kotlin, C++

MediaPipe version

com.google.mediapipe:solution-core:0.10.20, com.google.mediapipe:facemesh:0.10.20

Bazel version

No response

Solution

FaceMesh V1. We also tried the Face Landmarker Task API for landmark detection, but it failed, so we switched to V1. See the details below.

Android Studio, NDK, SDK versions (if issue is related to building in Android environment)

No response

Xcode & Tulsi version (if issue is related to building for iOS)

No response

Describe the actual behavior

We encounter a JNR issue (reported by Android as ANR, Application Not Responding). It occurs in 4 to 8 out of every 100 runs.

The minimal FaceMesh V1 call that reproduces it is as follows:

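import com.google.mediapipe.solutions.facemesh.FaceMesh;
import com.google.mediapipe.solutions.facemesh.FaceMeshOptions;

// Create the FaceMesh V1 solution: CPU only, streaming (non-static) mode, one face.
mFaceMesh = new FaceMesh(
    context,
    FaceMeshOptions.builder()
        .setRunOnGpu(false)
        .setStaticImageMode(false)
        .setMaxNumFaces(1)
        .setRefineLandmarks(true)
        .setMinDetectionConfidence(0.75f)
        .build()
);

// Set the result listener; it keeps only the latest landmark result.
mFaceMesh.setResultListener(faceMeshResult -> {
    mLastFaceMeshResult = faceMeshResult;
});

// Send each frame to MediaPipe for landmark calculation.
mFaceMesh.send(bitmap, timestamp);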

Describe the expected behaviour

The app should not experience any JNR (ANR).

Standalone code/steps you may have used to try to get what you need

Face Landmarker:
Run the following demo on a Pixel 4a running Android 13:
https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/face_landmarker/android
The app freezes after 2 to 4 minutes of execution, and processing is slow (10 to 15 frames per second).

FaceMesh V1: will update later.

Other info / Complete Logs

The original stack trace file is too large to upload, so I used Cursor to summarize it as follows. This is the result for FaceMesh V1.

# MediaPipe ANR Issue Summary

## Issue Overview

**Issue Type**: Application Not Responding (ANR)  
**Timestamp**: 2025-10-27 14:29:40  
**Application**: `com.example.Image`  
**Affected Activity**: `io.silvrr.base.alivedetectdemo.FunctionListActivity`  
**ANR Reason**: Application does not have a focused window

---

## Key Findings

### 1. Thread Blocking Pattern

**All MediaPipe worker threads (Thread-9 through Thread-16) are blocked at the same location:**


Thread-9 to Thread-16:
  native: #00 pc 0004fa70  /apex/com.android.runtime/lib64/bionic/libc.so (syscall+32)
  native: #01 pc 00c21d18  /data/app/.../lib/arm64/libmediapipe_jni.so (???)
  native: #02 pc 00c21da0  /data/app/.../lib/arm64/libmediapipe_jni.so (???)
  native: #03 pc 00c21ac8  /data/app/.../lib/arm64/libmediapipe_jni.so (???)
  native: #04 pc 00c213b0  /data/app/.../lib/arm64/libmediapipe_jni.so (???)
  native: #05 pc 00ba48c0  /data/app/.../lib/arm64/libmediapipe_jni.so (???)
  native: #06 pc 00ba452c  /data/app/.../lib/arm64/libmediapipe_jni.so (???)
  (no managed stack frames)


**Critical Observations:**
- ✅ All 8 threads blocked at **identical address** (`pc 00c21d18`) in `libmediapipe_jni.so`
- ✅ All threads waiting at `syscall+32` (likely `futex` system call for synchronization primitives)
- ✅ **No Java/Kotlin stack frames** - blocking occurs entirely in native (C++) layer
- ✅ Thread state: `state=S` (Sleeping/Blocked)
- ✅ Thread priority: `prio=10` (high priority, abnormal - normal MediaPipe threads should be `prio=5`)

### 2. Thread Count Anomaly

**Total Thread Count**: 154 threads  
**MediaPipe Worker Threads**: 8 threads (Thread-9 to Thread-16)  
**Expected Normal Count**: 2-4 MediaPipe worker threads

**Analysis:**
- Thread count is significantly higher than normal
- Multiple MediaPipe threads created, suggesting high-frequency input or resource contention

### 3. Main Thread State

**Main Thread**: Idle in the message loop (`MessageQueue.nativePollOnce`), not processing input


"main" prio=5 tid=1 Native
  state=S
  at android.os.MessageQueue.nativePollOnce(Native method)
  at android.os.MessageQueue.next(MessageQueue.java:335)


The main thread is parked in `MessageQueue.nativePollOnce`, waiting for the next message (display events included). The analysis attributes its failure to respond to user input to the underlying MediaPipe thread deadlock.

---

## Root Cause Analysis

### Problem Chain


High-frequency frame input
  ↓
MediaPipe creates excessive threads (8+ threads)
  ↓
Multiple threads simultaneously access shared resources (locks, mutexes, condition variables)
  ↓
Intense resource contention
  ↓
All threads wait for the same resource
  ↓
Deadlock/blocking formation
  ↓
Unable to return results to Java layer
  ↓
Main thread timeout → ANR


### Blocking Location

**Not JNI layer blocking**, but **MediaPipe native layer internal synchronization primitive blocking**:
- All threads blocked at the same address in `libmediapipe_jni.so`
- Blocking occurs in MediaPipe's internal synchronization mechanisms (mutex, condition variable, semaphore)
- The blocking is not at the Java-C communication boundary, but deep within MediaPipe's native implementation

### Resource Contention

**Evidence:**
1. 8 threads simultaneously blocked at identical address
2. All waiting on `syscall` (typically `futex` for lock mechanisms)
3. Thread count far exceeds normal values
4. High-priority threads (`prio=10`) exacerbate resource competition

**Possible Contended Resources:**
- Mutex locks (multiple threads trying to acquire the same lock)
- Condition variables (multiple threads waiting for the same condition)
- Semaphores (multiple threads waiting for the same semaphore)
- Shared memory/resources (model data, GPU context, memory pools)

---

## Impact

1. **User Experience**: Application becomes unresponsive, requires force close
2. **Business Impact**: Live detection workflow interrupted
3. **System Resources**: Excessive thread creation consumes system resources
4. **Performance**: High-priority threads compete for CPU, worsening the situation

---

## Technical Details

### Stack Trace Pattern

All MediaPipe threads show identical blocking pattern:
- **Location**: `libmediapipe_jni.so` at address `0x00c21d18`
- **System Call**: `syscall+32` (likely `futex` wait)
- **Thread State**: Sleeping/Blocked
- **No Java Frames**: Complete native-layer blocking
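
The "identical blocking pattern" claim can be checked mechanically against the raw dump rather than taken from a summary. The following is a minimal sketch, assuming thread headers look like `"Thread-9" prio=10 tid=30 Native` and native frames look like the `native: #01 pc ...` lines quoted earlier; `BlockedFrameGrouper` and both regular expressions are hypothetical and may need adjusting to the actual trace file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: groups threads by the first frame they have inside one library. */
final class BlockedFrameGrouper {
    // Assumed header format: "Thread-9" prio=10 tid=30 Native
    private static final Pattern HEADER =
        Pattern.compile("^\"([^\"]+)\"\\s+prio=\\d+");
    // Assumed frame format: native: #01 pc 00c21d18  /path/to/lib.so (symbol)
    private static final Pattern FRAME =
        Pattern.compile("^\\s*native: #\\d+ pc ([0-9a-fA-F]+)\\s+(\\S+)");

    /** Maps each pc value to the names of the threads whose first libName frame is at that pc. */
    static Map<String, List<String>> group(List<String> dumpLines, String libName) {
        Map<String, List<String>> byPc = new LinkedHashMap<>();
        String thread = null;      // thread whose frames are currently being read
        boolean matched = false;   // true once this thread's first libName frame is recorded
        for (String line : dumpLines) {
            Matcher header = HEADER.matcher(line);
            if (header.find()) {
                thread = header.group(1);
                matched = false;
                continue;
            }
            Matcher frame = FRAME.matcher(line);
            if (thread != null && !matched && frame.find()
                    && frame.group(2).endsWith(libName)) {
                byPc.computeIfAbsent(frame.group(1), k -> new ArrayList<>()).add(thread);
                matched = true;
            }
        }
        return byPc;
    }
}
```

Calling `BlockedFrameGrouper.group(lines, "libmediapipe_jni.so")` on the dump should yield a single key listing all eight worker threads if the pattern above holds. Several keys would suggest the threads are waiting at different places.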

### Thread Characteristics

| Thread | Priority | State | Blocking Address |
|--------|----------|-------|------------------|
| Thread-9 | 10 (high) | S (Sleeping) | pc 00c21d18 |
| Thread-10 | 10 (high) | S (Sleeping) | pc 00c21d18 |
| Thread-11 | 10 (high) | S (Sleeping) | pc 00c21d18 |
| Thread-12 | 10 (high) | S (Sleeping) | pc 00c21d18 |
| Thread-13 | 10 (high) | S (Sleeping) | pc 00c21d18 |
| Thread-14 | 10 (high) | S (Sleeping) | pc 00c21d18 |
| Thread-15 | 10 (high) | S (Sleeping) | pc 00c21d18 |
| Thread-16 | 10 (high) | S (Sleeping) | pc 00c21d18 |

---

## Conclusion

The ANR is caused by **excessive thread creation leading to resource contention/deadlock** in MediaPipe's native layer. All MediaPipe worker threads are blocked waiting for the same synchronization primitive, preventing the application from processing frames and responding to user input.

**Key Indicators:**
- ✅ 8+ MediaPipe threads simultaneously blocked
- ✅ All threads waiting at identical address (resource contention)
- ✅ Thread count far exceeds normal (154 threads total)
- ✅ Blocking at `syscall` level (waiting for locks/resources)
- ✅ High-priority threads exacerbate competition

**Recommended Solution**: Implement input throttling (e.g., Semaphore-based rate limiting) to control frame input frequency, indirectly reducing thread count and resource contention.
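
One way the Semaphore-based throttling recommended above could look is sketched below. It is an illustration, not MediaPipe API: `InFlightGate` is a hypothetical helper that allows a fixed number of frames inside the pipeline and drops any frame that arrives while the pipeline is busy. It assumes the result listener fires once per frame that was sent.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of MediaPipe): caps the number of frames that
 * are inside the pipeline at once and drops the rest instead of queueing them.
 */
final class InFlightGate<F> {
    private final Semaphore permits;
    private final Consumer<F> sender;

    InFlightGate(int maxInFlight, Consumer<F> sender) {
        this.permits = new Semaphore(maxInFlight);
        this.sender = sender;
    }

    /** Returns true if the frame was forwarded, false if it was dropped. */
    boolean offer(F frame) {
        if (!permits.tryAcquire()) {
            return false; // pipeline busy: drop the frame without blocking the caller
        }
        try {
            sender.accept(frame);
            return true;
        } catch (RuntimeException e) {
            permits.release(); // the frame never entered the pipeline
            throw e;
        }
    }

    /** Call once per forwarded frame, from the result listener or any error path. */
    void onResult() {
        permits.release();
    }
}
```

With the snippet from this issue, the `sender` would wrap `mFaceMesh.send(bitmap, timestamp)` and `onResult()` would be called inside the existing result listener. If a frame can fail without a result callback, the permit also needs releasing on that path, otherwise the gate stays closed.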

---

**Document Version**: v1.0  
**Created**: 2025-10-27  
**Based on**: ANR dumpstate analysis from `dumpstate-2025-10-27-14-37-46.txt`

We run FaceMesh V1 in an Android face recognition app to generate face landmarks. We are not using the Task API because FaceMesh V2 is too slow on Android, and with V2 we hit a similar JNR issue even more often. Is there an official way to use a lighter face landmark model with the latest API and still get reasonable reliability? If we stay on FaceMesh V1 (our solution needs a high frame rate, roughly 20 to 30 frames per second), what is the recommended way to implement face landmark detection?

If we have to compute face landmarks with the Task API and the V2 model, can it reach 30 frames per second with reasonable stability? What changes would the following demo need?
https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/face_landmarker/android
We tried this demo before, but it was slow (10 to 15 frames per second) and the app froze after 2 to 4 minutes on a Pixel 4a running Android 13. That is why we turned to FaceMesh V1.

Labels

type:bug (Bug in the Source Code of MediaPipe Solution)
type:performance (Execution Time and memory heap, stackoverflow and garbage collection related)
