A Streamlit-based web application for object detection with a choice of two models: YOLOv8 and GroundingDINO.
- YOLOv8 Detection: Fast object detection with support for 80 pre-defined classes
- GroundingDINO Detection: Flexible object detection using natural language descriptions
- Interactive UI: Upload images, view detections with bounding boxes, and get detailed results
- Multiple Detection Modes: YOLOv8 for speed, GroundingDINO for flexibility
- Python 3.8+ (tested on 3.14)
- pip
- Clone the repository:
  git clone <repository-url>
  cd <repository-name>
- Create a virtual environment (optional but recommended):
  python -m venv venv
  # On Windows
  venv\Scripts\activate
  # On macOS/Linux
  source venv/bin/activate
- Install dependencies:
  pip install -r requirements.txt
- Start the app:
  streamlit run Source/app.py
The app will open in your default browser at http://localhost:8501
- Select a Detection Model: Choose between YOLOv8 or GroundingDINO
- Upload an Image: Click to upload an image file or paste one from the clipboard
- Configure Detection Settings:
  - For GroundingDINO: Enter text descriptions of the objects to detect
  - For either model: Adjust the confidence threshold; the results update in real time
- View Results: The app displays bounding boxes, labels, and confidence scores
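One way a confidence threshold can update results in real time is to keep the raw detections from a single model run and re-filter them whenever the slider moves. The sketch below illustrates that idea only; the `Detection` class and `filter_by_confidence` function are hypothetical names, and the app itself may handle the threshold differently.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical container for one detected object."""
    label: str
    confidence: float
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def filter_by_confidence(detections: List[Detection], threshold: float) -> List[Detection]:
    """Keep only detections whose score is at or above the threshold."""
    return [d for d in detections if d.confidence >= threshold]


# Made-up detections standing in for one model run.
detections = [
    Detection("person", 0.91, (12, 30, 140, 310)),
    Detection("dog", 0.47, (150, 200, 260, 300)),
    Detection("bicycle", 0.22, (40, 180, 220, 330)),
]

kept = filter_by_confidence(detections, 0.40)
print([d.label for d in kept])  # ['person', 'dog']
```

Filtering cached detections avoids re-running the model on every slider change, at the cost of running the model once with a low threshold.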
YOLOv8:
- Fast real-time detection
- Supports 80 object classes
- Model downloaded automatically on first run
- Best for: Known object categories, speed-critical applications
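For reference, a minimal sketch of running YOLOv8 through the `ultralytics` package is shown here. The weights file `yolov8n.pt` and the function names are assumptions for illustration; this README does not state which YOLOv8 variant the app loads or how it structures its results.

```python
from typing import Dict, List, Sequence


def to_rows(
    xyxy: Sequence[Sequence[float]],
    confs: Sequence[float],
    class_ids: Sequence[float],
    names: Dict[int, str],
) -> List[dict]:
    """Turn raw box, score, and class-id lists into plain dictionaries."""
    rows = []
    for box, conf, cls_id in zip(xyxy, confs, class_ids):
        rows.append(
            {
                "label": names[int(cls_id)],
                "confidence": round(float(conf), 3),
                "box": [round(float(v), 1) for v in box],
            }
        )
    return rows


def detect_with_yolov8(image_path: str, conf: float = 0.25,
                       weights: str = "yolov8n.pt") -> List[dict]:
    """Run one image through YOLOv8 (weights name is an assumption)."""
    # Imported lazily so this module loads even without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # weights are fetched on first use if not cached
    result = model(image_path, conf=conf)[0]  # one Results object per image
    boxes = result.boxes
    return to_rows(
        boxes.xyxy.tolist(), boxes.conf.tolist(), boxes.cls.tolist(), result.names
    )
```

Calling `detect_with_yolov8("photo.jpg", conf=0.4)` would return one dictionary per detected object, suitable for loading into a pandas DataFrame for display.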
GroundingDINO:
- Flexible object detection using natural language
- Detects any object described in text
- Automatically downloaded from Hugging Face on first run
- Best for: Custom object detection, flexible queries
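The Hugging Face GroundingDINO checkpoints expect text queries to be lowercase, with each phrase ending in a period. The helper below sketches how several user-entered descriptions could be combined into one such prompt; `build_grounding_prompt` is a hypothetical name, and whether the app normalizes its input this way is an assumption.

```python
from typing import Iterable


def build_grounding_prompt(descriptions: Iterable[str]) -> str:
    """Join object descriptions into a single lowercase, period-terminated prompt."""
    # Normalize each phrase: trim whitespace, lowercase, drop any trailing period.
    phrases = [d.strip().lower().rstrip(".").strip() for d in descriptions]
    phrases = [p for p in phrases if p]  # skip empty entries
    if not phrases:
        return ""
    return ". ".join(phrases) + "."


print(build_grounding_prompt(["A cat", " Remote control. "]))  # a cat. remote control.
```

The resulting string would be passed as the text input alongside the image when calling the model's processor.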
See requirements.txt for the complete dependency list. The main packages are:
- streamlit
- torch
- pandas
- pillow
- transformers
- ultralytics
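A quick way to confirm that the packages above are installed in the active environment is to check whether each one can be found by the import system. This is an optional sketch, not part of the app; the `missing_packages` helper is hypothetical. Note that the `pillow` package is imported under the module name `PIL`.

```python
from importlib.util import find_spec
from typing import Dict, List

# Maps each pip package name to the module name it is imported as.
REQUIRED: Dict[str, str] = {
    "streamlit": "streamlit",
    "torch": "torch",
    "pandas": "pandas",
    "pillow": "PIL",
    "transformers": "transformers",
    "ultralytics": "ultralytics",
}


def missing_packages(required: Dict[str, str] = REQUIRED) -> List[str]:
    """Return the pip names of packages whose modules cannot be located."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are installed.")
```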
- Model files are downloaded automatically on first use and cached locally
- Ensure you have sufficient disk space (~2GB) for model files
- GPU (CUDA) support recommended for faster inference (CPU mode supported)
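The notes above can be checked up front with a short script. This is a sketch under assumptions: `pick_device` and `has_free_space` are hypothetical helpers, the 2 GB figure is taken from the note above, and the models may be cached in a different directory than the one checked here.

```python
import shutil


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed there is nothing to run on a GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def has_free_space(path: str = ".", required_gb: float = 2.0) -> bool:
    """Check whether the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


if __name__ == "__main__":
    print("Inference device:", pick_device())
    print("Enough disk space for models:", has_free_space())
```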
