Dear NVIDIA team,
I am currently testing GDINO on JPS, following the step-by-step guide from this link. However, inference is much slower than advertised (2-3 FPS vs. 11.6 FPS). Am I missing something here?
FYI, I tested on an NVIDIA AGX Orin 64GB with JetPack R36.4:
# R36 (release), REVISION: 4.3, GCID: 38968081, BOARD: generic, EABI: aarch64, DATE: Wed Jan 8 01:49:37 UTC 2025
# KERNEL_VARIANT: oot
TARGET_USERSPACE_LIB_DIR=nvidia
TARGET_USERSPACE_LIB_DIR_PATH=usr/lib/aarch64-linux-gnu/nvidia
For debugging purposes, I have included the experiment logs and the steps to reproduce the results below.
INFO
Image can be obtained here.
Input: jpeg, 800x1600px
Prompt:
{
  "model": "Grounding-Dino",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "car, bike"
        },
        {
          "type": "media_url",
          "media_url": {
            "url": "data:image/jpeg;asset_id,7f1465db-8f52-4641-822d-d6e94f1a63e2"
          }
        }
      ]
    }
  ],
  "threshold": 0.35
}
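The request body above can also be generated programmatically, which makes it easier to swap in the asset ID returned by the upload step. This is a minimal sketch that assumes exactly the fields shown above; `build_payload` is a hypothetical helper name and is not part of the JPS API.

```python
import json


def build_payload(asset_id, prompt, threshold=0.35):
    """Build the inference request body shown above (hypothetical helper)."""
    return {
        "model": "Grounding-Dino",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "media_url",
                        # Same URL scheme as in the prompt above.
                        "media_url": {"url": f"data:image/jpeg;asset_id,{asset_id}"},
                    },
                ],
            }
        ],
        "threshold": threshold,
    }


if __name__ == "__main__":
    payload = build_payload("7f1465db-8f52-4641-822d-d6e94f1a63e2", "car, bike")
    # Print the JSON; redirect the output to a file to use it with `curl -d @...`.
    print(json.dumps(payload, indent=2))
```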
Experiment and Log
Here is how I tested it, following the same steps as the tutorial linked above.
Step 1.
curl -X POST "http://localhost:8000/files" -H "Content-Type: multipart/form-data" -F purpose="vision" -F media_type="image" -F "file=image.jpeg"
Step 2.
curl -X POST http://localhost:8000/inference -H "Content-Type: application/json" -d @test.json
I also did a for-loop test like below:
import time
import os

# Upload step (run once beforehand):
# os.system('curl -X POST "http://localhost:8000/files" -H "Content-Type: multipart/form-data" -F purpose="vision" -F media_type="image" -F "file=@image.jpeg"')

# Send the same inference request 10 times and print the wall-clock time of each call.
for i in range(10):
    start = time.time()
    os.system('curl -X POST http://localhost:8000/inference -H "Content-Type: application/json" -d @test.json')
    print(i, f'prompt time: {time.time()-start:.3f}s')
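Because the loop above starts a new `curl` process for every request, each measurement includes process start-up time in addition to the HTTP round trip. The sketch below is one way to reduce that overhead: it sends the same request from within Python using only the standard library, discards a few warm-up calls, and reports the mean latency and the resulting FPS. The helper names (`post_inference`, `time_calls`, `summarize`) and the warm-up count are assumptions made for illustration. The result is still end-to-end request latency, not pure model inference time.

```python
import json
import statistics
import time
import urllib.request


def post_inference(url, payload):
    """POST the JSON payload and return the raw response body."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def time_calls(send, runs=10, warmup=2):
    """Call send() warmup times untimed, then return the latencies of `runs` timed calls."""
    for _ in range(warmup):
        send()
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        send()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return the mean latency in seconds and the corresponding requests per second."""
    mean = statistics.mean(latencies)
    return {"mean_s": mean, "fps": 1.0 / mean}


if __name__ == "__main__":
    # Reuses the same test.json as the curl command above.
    with open("test.json") as f:
        payload = json.load(f)
    url = "http://localhost:8000/inference"
    print(summarize(time_calls(lambda: post_inference(url, payload))))
```

If the FPS reported this way is still close to 2-3, the `curl` start-up cost is unlikely to explain the gap.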
And I get these results (2-3 FPS):
