Reducing Latency for Hand-Tracking Solution in Python

### Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

### OS Platform and Distribution

Windows 10 (amd64)

### MediaPipe Tasks SDK version

MediaPipe version: 0.10.20

### Task name (e.g. Image classification, Gesture recognition etc.)

Hand landmark detection

### Programming Language and version (e.g. C++, Python, Java)

Python

### Describe the actual behavior

Using the LITE model (`model_complexity=0`), I'm measuring a latency of 27-35 ms per `hands.process()` call.

### Describe the expected behaviour

According to https://ai.google.dev/edge/mediapipe/solutions/vision/hand_landmarker, the FULL model (`model_complexity=1`) has a latency of 17 ms on CPU and 12 ms on GPU.

### Standalone code/steps you may have used to try to get what you need

The way I measure latency is as follows:


```python
import time

import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils

# Assumed setup: default webcam. draw_tip_distances, draw_hand_distances,
# cal_distances, point_list and tip_list are defined elsewhere in my script.
cap = cv2.VideoCapture(0)


def log_latency(start_time, event):
    """Print and return the time elapsed since start_time, in seconds."""
    elapsed_time = time.time() - start_time
    print(f"[{event}] Elapsed Time: {elapsed_time:.4f} seconds")
    return elapsed_time


with mp_hands.Hands(model_complexity=0, min_detection_confidence=0.3, min_tracking_confidence=0.5) as hands:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            # Stop if no frame could be read from the camera
            break

        # Convert BGR to RGB
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        # Flip horizontally
        image = cv2.flip(image, 1)

        # Mark the image as read-only while it is processed
        image.flags.writeable = False

        # Run detection and time only the process() call
        det_time = time.time()
        results = hands.process(image)
        log_latency(det_time, "Landmark Detection")

        # Make the image writeable again for drawing
        image.flags.writeable = True

        # Convert RGB back to BGR
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

        # Debug output (disabled)
        # print(results)

        # Render results
        if results.multi_hand_landmarks:
            for num, hand in enumerate(results.multi_hand_landmarks):
                mp_drawing.draw_landmarks(
                    image, hand, mp_hands.HAND_CONNECTIONS,
                    mp_drawing.DrawingSpec(color=(51, 102, 0), thickness=1, circle_radius=2),
                    mp_drawing.DrawingSpec(color=(33, 165, 205), thickness=2, circle_radius=0),
                )

            # Draw finger distances to image from point list
            draw_tip_distances(image, results, point_list)

            # Draw hand distances to image from tip list
            # (num is the index of the last hand from the loop above)
            draw_hand_distances(num, image, results, tip_list)

        # Check for countdown trigger
        key = cv2.waitKey(10) & 0xFF
        if key == ord('c'):
            cal_distances(image, hands)

        # Show the image
        cv2.imshow('Hand Tracking', image)

        if key == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()
```
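
Single-call timings can be noisy, and the first few calls may include one-time initialization cost. As a minimal sketch of a steadier measurement, the helper below times each call with `time.perf_counter()`, discards a configurable number of warm-up samples, and reports the median and an approximate 95th percentile. `LatencyTracker`, `measure`, and `summary` are hypothetical names for illustration, not part of MediaPipe.

```python
import statistics
import time


class LatencyTracker:
    """Collects per-call latencies and summarizes them (hypothetical helper)."""

    def __init__(self, warmup=5):
        self.warmup = warmup   # number of initial samples to discard
        self.seen = 0          # total calls measured so far
        self.samples_ms = []   # kept samples, in milliseconds

    def measure(self, fn, *args, **kwargs):
        """Time one call to fn and return fn's result unchanged."""
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        self.seen += 1
        if self.seen > self.warmup:
            self.samples_ms.append(elapsed_ms)
        return result

    def summary(self):
        """Return count, median and approximate p95 in ms, or None if empty."""
        if not self.samples_ms:
            return None
        ordered = sorted(self.samples_ms)
        p95_index = min(len(ordered) - 1, int(0.95 * len(ordered)))
        return {
            "count": len(ordered),
            "median_ms": statistics.median(ordered),
            "p95_ms": ordered[p95_index],
        }
```

In the loop above, this could replace the `log_latency` call with `results = tracker.measure(hands.process, image)`, followed by `print(tracker.summary())` after the loop ends.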


### Other info / Complete Logs

I'm trying to get the hand tracking as close to real time as possible.
I apologize if I'm misinterpreting the "expected" latency values.

Options I have considered or tried:
* Hardware acceleration: Unfortunately, I don't have a CUDA GPU, and as far as I know GPU acceleration isn't supported in the Python solution anyway.
* Tuning the detection and tracking confidence thresholds: this yielded improvements of about 2 ms.
* Tracking only the necessary landmarks: For my task I only need the wrist and fingertip landmarks, but the model tracks all 21 landmarks. Would it be possible to create a custom model like this, and would it reduce latency?
* Switching to C++: I had problems setting up the MediaPipe Framework. The Hello World example ran successfully, but the hand_tracking_cpu example failed to build.
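
For reference on the landmark-subset idea: picking the wrist and fingertips out of the existing result is simple, although this only filters the output and does not by itself change the inference cost. A minimal sketch, assuming a hypothetical `select_keypoints` helper and the standard MediaPipe hand landmark indices (wrist 0, fingertips 4, 8, 12, 16, 20). The stand-in hand object exists only so the sketch runs without a camera.

```python
from types import SimpleNamespace

# Indices of the wrist and fingertips in MediaPipe's 21-point hand model
KEYPOINT_INDICES = {
    "wrist": 0,
    "thumb_tip": 4,
    "index_finger_tip": 8,
    "middle_finger_tip": 12,
    "ring_finger_tip": 16,
    "pinky_tip": 20,
}


def select_keypoints(hand_landmarks):
    """Return {name: (x, y, z)} for the wrist and fingertips only.

    hand_landmarks is expected to expose a `.landmark` sequence of 21
    points with x, y, z attributes, like one entry of
    results.multi_hand_landmarks.
    """
    points = hand_landmarks.landmark
    return {
        name: (points[i].x, points[i].y, points[i].z)
        for name, i in KEYPOINT_INDICES.items()
    }


# Stand-in for one detected hand (hypothetical data, not real output)
fake_hand = SimpleNamespace(
    landmark=[SimpleNamespace(x=i / 20, y=0.5, z=0.0) for i in range(21)]
)
print(select_keypoints(fake_hand)["wrist"])
```

With real results, the call would be `select_keypoints(hand)` for each `hand` in `results.multi_hand_landmarks`.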
 
I'll gladly provide further details if necessary. Thanks!
