MemryX MX3 detector integration #17723
base: dev
Conversation
memryx-support-added
Let me know when the docs are ready for final review @tim-memryx.

We did a build of this PR and found that the docker image is substantially larger, from ~1G to ~5G; is this expected? It will definitely be problematic for our default build to be this large, so we would likely want to push this to a separate docker build variant if that were the case.
Hi @NickM-27, yep, the docs are ready for review. Please let us know if they look okay or if we need any changes, thanks!

Regarding the size, that's all from the pip-installed dependencies. We'll update our pip install commands in the dockerfile to clean this up now.
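(For reference, a common pattern for this kind of cleanup is to disable pip's cache in the install layer. The snippet below is a generic sketch, not this PR's actual diff, and the package name is a placeholder.)

```dockerfile
# Sketch only: --no-cache-dir keeps pip's downloaded wheels out of the image layer.
# "memryx" stands in for whatever packages the PR actually installs.
RUN pip3 install --no-cache-dir memryx
```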
```
-v /path/to/your/config:/config \
-v /etc/localtime:/etc/localtime:ro \
-e FRIGATE_RTSP_PASSWORD='password' \
--add-host gateway.docker.internal:host-gateway \
```
Users running docker compose will likely need to add an `extra_hosts:` section too, correct? If so, it would be good to add this above.
Can you explain more about what's going on internally with this?
Thank you! Yes, I've added it to the Compose file now. We use `gateway.docker.internal:host-gateway` to allow the container to communicate with the device arbitration daemon running on the host.
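(For reference, the compose-side equivalent of the `--add-host` flag is an `extra_hosts` entry; a minimal fragment, with the service name being illustrative:)

```yaml
services:
  frigate:
    # ... other settings ...
    extra_hosts:
      - "gateway.docker.internal:host-gateway"
```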
@@ -132,6 +132,71 @@

If you are using `docker run`, add this option to your command `--device /dev/ha`

Finally, configure [hardware object detection](/configuration/object_detectors#hailo-8l) to complete the setup.
### MemryX MX3 |
This should go lower under the community-supported section for now. You can also add it to the info card at the top to link to it.
Due to the MX3's architecture, the maximum frames per second supported cannot be calculated as `1/inference time` and is measured separately. When deciding how many camera streams you may support with your configuration, use the **MX3 Total FPS** column to approximate the detector's limit, not the Inference Time.

| Model | Input Size | MX3 Inference Time | MX3 Total FPS |
Let's not include FPS here; it is not really clear to users because Frigate can run multiple detections on the same frame, and since we don't have this metric in other sections it isn't directly comparable.
We included the FPS information in case a user is reading the Coral TPU section that says:

> You can calculate the maximum performance of your Coral based on the inference speed reported by Frigate. With an inference speed of 10, your Coral will top out at 1000/10=100, or 100 frames per second.

This calculation of `1000 / inference time (ms) = FPS` isn't true for the MX3, so we listed them separately. For example, for yolov9s-320, 1000/16 ms = 62.5 FPS, but the chip is actually running at 382 FPS (around a 6x difference).

In other words: the input-to-output latency of a single frame for yolov9s-320 is 16 ms, while the time between output frames is 2.6 ms (1000/382).

What if we remove the FPS column and instead have a single "Inference Speed" column with time-between-frames latency instead of input-to-output latency? This would remove the extra column while still giving the user a general feel for the accelerator's inference capacity relative to CPUs.
> This calculation of `1000 / inference time (ms) = FPS` isn't true for the MX3, so we listed them separately. For example, for yolov9s-320, 1000/16 ms = 62.5 FPS, but the chip is actually running at 382 FPS (around a 6x difference).
I'm not sure how that can be the case. If an inference as shown in the Frigate UI is 16 ms, then Frigate will only be able to run 62.5 detections in that second, which is exactly what these tables should be presenting to the user.
Currently in `async_run_detector`, the `duration` timestamps start when the input thread pushes a new frame to the hardware detector and stop when the output thread receives the frame, so this would be "in-to-out" latency.

But the output frames to the user are updated more frequently than this, because the async pipeline has multiple frames in flight at a time. The time between updates to the user is the "frame-to-frame" latency, from which we calculate FPS.

Note that for the synchronous `run_detector`, the "in-to-out" and "frame-to-frame" latencies are equal.

Since the goal is to show users the detections per second, we could modify `async_run_detector` to report `duration` as the time between outputs ("frame-to-frame"). Then we'll redo the benchmarks and have a single column in the docs.

Would this be okay?
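(To make the distinction concrete, here is a small self-contained toy simulation, not Frigate or MemryX code: a pipelined "accelerator" whose per-frame latency is 16 ms while it still emits a result every ~2.6 ms, using the yolov9s-320 numbers quoted above.)

```python
import queue
import threading
import time

IN_TO_OUT = 0.016       # assumed per-frame input-to-output latency (16 ms)
SEND_INTERVAL = 0.0026  # assumed pipeline intake rate (one frame every ~2.6 ms)
N_FRAMES = 200

inputs = queue.Queue()
outputs = queue.Queue()

def accelerator():
    """Toy pipelined accelerator: each frame completes IN_TO_OUT seconds after
    it was sent, even though newer frames keep entering in the meantime."""
    while True:
        sent_at = inputs.get()
        if sent_at is None:
            outputs.put(None)
            return
        # Wait until this frame's completion time, then emit its result.
        time.sleep(max(0.0, sent_at + IN_TO_OUT - time.monotonic()))
        outputs.put((sent_at, time.monotonic()))

def sender():
    for _ in range(N_FRAMES):
        inputs.put(time.monotonic())
        time.sleep(SEND_INTERVAL)
    inputs.put(None)

threading.Thread(target=accelerator, daemon=True).start()
threading.Thread(target=sender, daemon=True).start()

in_to_out, gaps, prev = [], [], None
while (item := outputs.get()) is not None:
    sent_at, received_at = item
    in_to_out.append(received_at - sent_at)  # what `duration` measures today
    if prev is not None:
        gaps.append(received_at - prev)      # frame-to-frame latency
    prev = received_at

print(f"mean in-to-out latency : {1000 * sum(in_to_out) / len(in_to_out):6.1f} ms")
print(f"mean frame-to-frame gap: {1000 * sum(gaps) / len(gaps):6.1f} ms")
print(f"implied throughput     : {len(gaps) / sum(gaps):6.0f} FPS")
```

Running it shows throughput close to `1/SEND_INTERVAL` even though each individual frame's latency stays near `IN_TO_OUT` (OS sleep granularity adds some noise to the exact figures).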
I think I am mostly understanding. How much of the inference is truly parallelized, though?
Hey Tim. I went ahead and tested the latest code here in the PR. I re-ran the installation script, restarted, and I'm seeing this:

I do see the device showing up in dmesg:

I can see it in

Can you let me know if I've missed something?
Regarding the image size, it is better but still about twice the size of the image before this PR. It looks like torch-cpu is using a lot of it; is this required for inference?
Hi @hawkeye217, thank you for the update. When you get a chance, could you please check the following to help troubleshoot the connection issue?

1. Service status. Check if the MemryX daemon is running and connected:

```bash
sudo service mxa-manager status
```

2. Listening interface. Ensure the daemon is listening on all interfaces:

```bash
sudo sed -i 's/^LISTEN_ADDRESS=.*/LISTEN_ADDRESS="0.0.0.0"/' /etc/memryx/mxa_manager.conf
```

After making the above change, restart the daemon:

```bash
sudo service mxa-manager restart
```

3. Firewall rules (if using UFW). If your system uses ufw, the firewall might be blocking access between the container and the host:

```bash
sudo ufw allow from 172.17.0.0/24 to any
```

Please let me know if you're still seeing issues afterward; happy to help further!
Here's what I see:
I apologize. Could you please try:
Thanks for sharing that; this confirms:

Just to rule out any firewall-related issues, could you please try temporarily disabling UFW with:

After that, can you please restart the container and check? Thank you.
Thanks for the assistance. Disabling `ufw` worked.

Disabling the firewall isn't really a long-term solution, though, so what firewall allow rules are needed to make this work correctly? This may be a hangup for our very security-conscious and privacy-focused users, who run on a variety of platforms, operating systems, and hardware. None of the other detectors that Frigate supports need network connections back to the host in any way, so it would probably be good to document exactly what the `mxa-manager` service requires.
Something else of note: it looks like the model files are downloading every time I restart Frigate.
Hi @hawkeye217,

For firewall rules, we need the host to be able to accept connections from the docker virtual interface to the host for mxa_manager connections (defaulting to ports 10000, 10001, and 10002). I think the issue might be that the docker interfaces can have varying IPs on different systems, or maybe...

The network config can definitely be tricky/risky for users. On that note, the upcoming SDK release (in a month or so) supports using unix domain sockets instead of TCP/IP, and defaults to this, which should make the docker config a lot easier, since it's just a matter of adding a volume mount for the socket file.
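(Given the defaults mentioned here, a narrower rule than opening the whole host might look like the following. The 172.17.0.0/16 subnet is Docker's default bridge and the port range is the default from this thread; both are assumptions to adjust per system.)

```bash
# Allow only the mxa_manager ports, and only from the Docker bridge subnet.
sudo ufw allow from 172.17.0.0/16 to any port 10000:10002 proto tcp
```

Once the unix-domain-socket mode ships, this should reduce to a single `volumes:` mount of the socket file, with no firewall rules needed at all.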
Thanks, Tim. That's great. We look forward to the upcoming release.
Proposed change
This PR adds support for the MemryX MX3 AI accelerator as an option for object detectors (SDK link).
We've included options for the following models:
We've added a section to `main/Dockerfile` that installs necessary dependencies inside the container and downloads "DFP" files for each model (this is what we call compiled models). Note that these are compiled from their upstream sources (OpenVINO and Ultralytics), not retrained nor quantized, as the MX3 runs models with floating-point math.

We've tested the containers across x86 systems, Raspberry Pi, and Orange Pi.
Now to summarize the changed/added files:
Setup Scripts & Dockerfile

- New file: `memryx/user_installation.sh`
  - Purpose: installs required MX3 drivers and core libraries on the container host machine.
  - Note: assumes a Debian/Ubuntu host (or their derivative distros such as Raspberry Pi OS and Armbian).
- Dockerfile modifications: `main/Dockerfile`
  - Added a section to install dependencies (core libraries + Python pip packages) and download DFPs.
Frigate Additions/Modifications

- New file: `frigate/detector/plugins/memryx.py`
  - Implements the detector using the MX3 for the currently supported models. Called by `async_run_detector`.
- Modified file: `frigate/object_detection/base.py`
  - Adds an asynchronous `async_run_detector()` function. This function replaces the blocking call to `detect_raw` with two asynchronous Python threads: one that calls the detector's `send_input` and another that calls `receive_output` (a simplified sketch of this two-thread shape follows below). This allows a purely async architecture such as the MX3 to reach maximum FPS.
  - `async_run_detector()` was made to be agnostic to the MemryX detector and potentially useful for others, while MemryX-specific functionality is kept separate in the detector plugin file.
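(A highly simplified sketch of that two-thread shape: `send_input`/`receive_output` are the method names from this PR, while the queue-based signature, sentinel shutdown, and everything else here is illustrative only, not the PR's actual code.)

```python
import threading
from queue import Queue

def async_run_detector(detector, frame_queue: Queue, result_queue: Queue) -> None:
    """Drive a detector exposing send_input()/receive_output() with two threads,
    so several frames can be in flight inside the accelerator at once."""

    def input_loop():
        # A None frame acts as a shutdown sentinel in this sketch.
        while (frame := frame_queue.get()) is not None:
            detector.send_input(frame)  # hand the frame to the hardware and return

    def output_loop():
        while True:
            # Blocks until the accelerator has a result ready.
            result_queue.put(detector.receive_output())

    in_t = threading.Thread(target=input_loop)
    out_t = threading.Thread(target=output_loop, daemon=True)  # shutdown handling omitted
    in_t.start()
    out_t.start()
    in_t.join()
```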
Summary

This adds the MemryX MX3 and multiple object detection models.
Please let us know if there are any changes you would like to see, as we're very excited to be added to Frigate!
Type of change

Checklist

- My code has been formatted (`ruff format frigate`)