Chronos is an event-driven video ingestion and analytics engine. It slices live video streams into chunks, runs computer-vision object detection (YOLO) on them in real time, stores semantic embeddings of the results in a vector database (Qdrant), and supports natural-language search (e.g., "Find me a clip of a bird").
Ingestion (Go) -> Streaming (Kafka) -> Inference (Python/YOLO) -> Storage (Qdrant)
- Stream Slicer (Go): Downloads video streams, segments them into 5-second chunks, and publishes metadata to Kafka.
- Message Broker (Redpanda/Kafka): Decouples ingestion from processing, buffering messages so that high-throughput ingestion does not overwhelm slower inference workers.
- ML Worker (Python): Consumes messages, runs YOLOv8 object detection, and generates text descriptions (e.g., "A video containing: person, car").
- Vector Store (Qdrant): Stores the semantic embeddings of the descriptions.
- Search API: Allows querying the video library using natural language.
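
The hand-off between the Stream Slicer and the ML Worker can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the `ChunkMessage` fields, the JSON encoding, and the fallback text for an empty detection list are all assumptions. Only the 5-second chunk length and the "A video containing: ..." description format come from the component list above.

```python
import json
from dataclasses import dataclass


@dataclass
class ChunkMessage:
    """Hypothetical metadata the Stream Slicer might publish per chunk."""
    stream_id: str
    chunk_index: int
    path: str
    duration_s: float = 5.0  # chunk length stated in the component list

    @property
    def start_s(self) -> float:
        # Offset of this chunk from the start of the stream, in seconds.
        return self.chunk_index * self.duration_s


def parse_chunk_message(raw: bytes) -> ChunkMessage:
    """Decode a JSON-encoded Kafka message value (assumed encoding)."""
    return ChunkMessage(**json.loads(raw))


def build_description(labels: list[str]) -> str:
    """Turn detected class names into the text that gets embedded."""
    unique = list(dict.fromkeys(labels))  # dedupe, keep first-seen order
    if not unique:
        # Assumption: how an empty detection list is described.
        return "A video containing: nothing detected"
    return "A video containing: " + ", ".join(unique)


msg = parse_chunk_message(
    b'{"stream_id": "cam1", "chunk_index": 3, "path": "/tmp/cam1_3.mp4"}'
)
print(msg.start_s)
print(build_description(["person", "car", "person"]))
```

In a real worker, `raw` would be the value of a message returned by a confluent-kafka consumer, and `labels` would come from the YOLOv8 detections for that chunk.
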
- Languages: Go (Ingestion), Python 3.13 (ML/Inference)
- AI Models: Ultralytics YOLOv8 (Vision), SentenceTransformers (Embeddings)
- Infrastructure: Docker, Redpanda (Kafka-compatible), Qdrant (Vector DB)
- Libraries: confluent-kafka, requests (Direct API implementation)
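
The "Direct API implementation" note appears to mean that Qdrant is reached over its REST API with `requests` rather than through a client library. Under that assumption, the sketch below builds the URL and JSON body for storing one embedding and for running a similarity search. The base URL, the collection name `video_chunks`, the payload fields, and the UUID-based point IDs are hypothetical. The endpoints shown are Qdrant's upsert-points and search-points REST routes; newer Qdrant releases also offer a `/points/query` route.

```python
import uuid

QDRANT_URL = "http://localhost:6333"  # assumption: Qdrant's default REST port
COLLECTION = "video_chunks"           # hypothetical collection name


def upsert_request(chunk_id: str, vector: list[float], description: str):
    """Build (url, body) for Qdrant's upsert-points endpoint (HTTP PUT)."""
    # Qdrant point IDs must be unsigned integers or UUIDs, so derive a
    # deterministic UUID from the chunk identifier.
    point_id = str(uuid.uuid5(uuid.NAMESPACE_URL, chunk_id))
    url = f"{QDRANT_URL}/collections/{COLLECTION}/points"
    body = {
        "points": [
            {
                "id": point_id,
                "vector": vector,
                "payload": {"chunk_id": chunk_id, "description": description},
            }
        ]
    }
    return url, body


def search_request(query_vector: list[float], limit: int = 5):
    """Build (url, body) for Qdrant's search-points endpoint (HTTP POST)."""
    url = f"{QDRANT_URL}/collections/{COLLECTION}/points/search"
    body = {"vector": query_vector, "limit": limit, "with_payload": True}
    return url, body


url, body = upsert_request("cam1/3", [0.1, 0.2], "A video containing: bird")
print(url)
print(body["points"][0]["payload"])
```

With `requests` installed, the pairs would be sent as `requests.put(url, json=body)` for the upsert and `requests.post(url, json=body)` for the search. The query vector would come from encoding the user's search text with the same SentenceTransformers model used for the descriptions.
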