UI available now #32
-
Hey man, great work indeed! I've been modifying this project for my own needs for the last 3 days. It works perfectly with "anthropic/claude-3.5-sonnet:beta", but it's quite costly: analysing 10 frames costs around $0.1. Have you had any luck with open-source LLMs such as qwen/qwen-2-vl-7b-instruct? It shows good results in the Hugging Face benchmarks, but in practice it hallucinates a lot. Here's the result for a 1-frame video (the frame is attached too). It happens with multi-frame videos as well.
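To put that cost in perspective, here is a rough back-of-the-envelope sketch. It assumes the cost scales linearly with the number of frames (about $0.01 per frame, from the $0.1 per 10 frames figure above) and ignores any extra per-video calls; `estimate_cost` is a hypothetical helper, not part of the project.

```python
def estimate_cost(num_frames: int, cost_per_frame: float = 0.1 / 10) -> float:
    """Rough linear cost estimate in USD.

    Assumption: every frame costs the same (~$0.01, derived from the
    observed ~$0.1 for 10 frames). Real pricing depends on token counts.
    """
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    return num_frames * cost_per_frame


if __name__ == "__main__":
    # Print estimates for a few illustrative frame counts.
    for frames in (10, 60, 300):
        print(f"{frames} frames: ~${estimate_cost(frames):.2f}")
```

Under that assumption, a 300-frame analysis would land at roughly $3, which is why a cheaper model is attractive.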
-
I ran a couple of tests with Llama 11B, and it's better than Qwen. The one I found performing best is "google/gemini-flash-1.5-8b".
Give it a try. It's a bit more expensive than Llama, but I think it's better.
-
What I mean is that Llama was not able to fully describe the captured frame, so the video description (describe.txt) ended up missing context. So the issue for me was Llama's performance. I ran 5 different videos through both Llama and Gemini: Gemini gave me the desired result on all of them, while Llama gave maybe 2 good results.
-
A simple UI with drag and drop is now available.
See the instructions in the UI directory.
Install
pip install video-analyzer-ui
Run
video-analyzer-ui