This is the repository reference for person detection plus distance monitoring. Use it when you need person-to-person spacing logic on top of stereo-aligned detections.
- You need social-distance or proximity monitoring.
- You want a bird’s-eye view derived from stereo depth.
- You need a person-specific spatial detection baseline rather than a generic detector.

Do not use it when:
- You need general object spatial detections.
- You need hand-to-object safety logic instead of person-to-person distance.
- You need a pure RGB detector with no stereo branch.
- Category: neural-networks/object-detection/social-distancing
- Shape: script + standalone
- Primary task: detect people and monitor their pairwise distance
- Entrypoint: main.py
- Standalone path: oakapp.toml
- Frontend: none
- Runs on: devices with CAM_A, CAM_B, and CAM_C; RVC2 peripheral, RVC4 peripheral, and RVC4 standalone packaging
- Requires: SCRFD person detector, stereo depth, and calibration
- Input: live color and stereo pair
- Output: Video, Detections, Distances, and Bird-eye view
- Models: SCRFD person-detection YAMLs in depthai_models/
- Visualizer / UI: DepthAI Visualizer via dai.RemoteConnection
- README.md
- main.py
- utils/measure_object_distance.py
- utils/host_social_distancing.py
- utils/host_bird_eye_view.py
- A color camera on CAM_A feeds the person detector.
- StereoDepth on CAM_B/C aligns depth back to the RGB stream.
- DepthMerger attaches depth to the 2D person detections.
- Host nodes derive pairwise distances and a bird’s-eye view from those spatial detections.
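
The bird’s-eye step above can be pictured as a top-down projection of each person's spatial coordinates. The following is a minimal sketch, not the code in utils/host_bird_eye_view.py: the function name `to_bird_eye`, the canvas size, and the range limits are hypothetical, and it assumes coordinates in millimetres with x pointing right and z pointing away from the camera.

```python
from typing import List, Tuple


def to_bird_eye(
    points_mm: List[Tuple[float, float, float]],
    canvas_size: Tuple[int, int] = (400, 400),  # (width, height) in pixels, hypothetical
    max_range_mm: float = 8000.0,               # farthest depth drawn, hypothetical
    half_width_mm: float = 4000.0,              # lateral extent to each side, hypothetical
) -> List[Tuple[int, int]]:
    """Project (x, y, z) points onto a top-down canvas.

    The camera sits at the bottom centre of the canvas; larger z moves a
    point towards the top row. Height (y) is ignored in a top-down view.
    Points without usable depth or outside the drawn area are skipped.
    """
    width, height = canvas_size
    pixels = []
    for x, _y, z in points_mm:
        if not (0.0 < z <= max_range_mm) or abs(x) > half_width_mm:
            continue  # no valid depth, or outside the canvas extent
        u = int(round((x + half_width_mm) / (2 * half_width_mm) * (width - 1)))
        v = int(round((1.0 - z / max_range_mm) * (height - 1)))
        pixels.append((u, v))
    return pixels
```

A person standing straight ahead at half the maximum range lands near the middle of the canvas; the resulting pixel positions could then be drawn as dots on a blank image.
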
- The example requires three cameras and aligned stereo depth.
- It is person-specific; it is not a general spatial-object monitor.
- Distance logic quality depends on reliable depth for each detected person.
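
To illustrate why per-person depth matters, here is a minimal sketch of pairwise distance logic that skips detections without usable depth. It is not the code in utils/host_social_distancing.py: the function name `pairwise_distances` and the 2000 mm alert threshold are hypothetical, and coordinates are assumed to be in millimetres.

```python
import math
from itertools import combinations
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


def pairwise_distances(
    people_mm: List[Optional[Point]],
    alert_below_mm: float = 2000.0,  # hypothetical proximity threshold
) -> List[Tuple[int, int, float, bool]]:
    """Return (index_a, index_b, distance_mm, too_close) for every valid pair.

    A detection is treated as valid only if it has finite coordinates and a
    positive z; anything else is excluded rather than reported as a distance.
    """
    valid = [
        (i, p)
        for i, p in enumerate(people_mm)
        if p is not None and all(math.isfinite(c) for c in p) and p[2] > 0
    ]
    results = []
    for (i, a), (j, b) in combinations(valid, 2):
        d = math.dist(a, b)  # Euclidean distance in 3D
        results.append((i, j, d, d < alert_below_mm))
    return results
```

Dropping invalid detections up front means a person with missing depth produces no distance at all, which is usually easier to diagnose than a wrong one.
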
- neural-networks/object-detection/human-machine-safety: use this when you need palm-to-object safety logic
- neural-networks/object-detection/spatial-detections: use this when you need the general spatial-detection baseline
- neural-networks/counting/people-counter: use this when you only need a current-frame person count
- Run: python3 main.py
- Success looks like: the Visualizer shows people detections, distance annotations, and a bird’s-eye view that updates with scene geometry
- Common failure meaning: stereo alignment is poor, the device lacks the required camera set, or the scene does not produce stable person detections