Yolov_LLM

Video Object Detection (YOLOv5) and Multimodal Vision-Language Model (LLaVA 13B)

This project integrates YOLOv5 (You Only Look Once) object detection with a Large Language Model (LLM) to add contextual understanding to detected objects. Objects are detected in video with YOLOv5, and a multimodal vision-language model (LLaVA 13B) supplies language-based context for them, with the aim of improving accuracy and producing a detailed description of each detection. Intended applications include autonomous systems, surveillance, and AI-driven analytics.
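
As a rough illustration of how detector output might be handed to a language model, the sketch below turns a list of detections into a text prompt. It is not taken from this repository: the `Detection` class, the `build_prompt` function, the confidence threshold, and the prompt wording are all hypothetical, and no YOLOv5 or LLaVA call is made.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical container for one detected object (not the repo's type)."""
    label: str
    confidence: float
    box: tuple  # (x1, y1, x2, y2) in pixels


def build_prompt(detections, min_confidence=0.5):
    """Summarize detections as a text prompt for a vision-language model.

    The threshold and wording are illustrative assumptions.
    """
    # Drop low-confidence detections so they do not mislead the model.
    kept = [d for d in detections if d.confidence >= min_confidence]
    if not kept:
        return "Describe what is happening in this frame."

    # Count objects per class, sorted by label for a stable prompt.
    counts = Counter(d.label for d in kept)
    summary = ", ".join(f"{n} x {label}" for label, n in sorted(counts.items()))
    return (
        f"An object detector found: {summary}. "
        "Describe the scene and how these objects relate to each other."
    )


# Example frame with made-up detections.
frame_detections = [
    Detection("person", 0.91, (10, 20, 110, 300)),
    Detection("car", 0.84, (200, 150, 480, 320)),
    Detection("person", 0.72, (130, 30, 210, 310)),
    Detection("dog", 0.30, (400, 260, 450, 310)),  # below threshold, dropped
]
prompt = build_prompt(frame_detections)
print(prompt)
```

In a real pipeline the prompt would be sent to the vision-language model together with the frame itself; that step is omitted here because it depends on how the model is served.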

The fine-tuned model checkpoints are available on the Hugging Face Hub: https://huggingface.co/AgamP/LLM_Custom_1/tree/main

The proposal, project report, and summary are available in this repository; read them for the project's context.