I’m currently a researcher 🧪 working on multimodal representation learning and memory-augmented systems for video-language modeling. My work focuses on building and understanding systems that can store, retrieve, and reason over information across long time horizons by combining vision, language, and structured memory, inspired by the way the human brain perceives, processes, and stores memories. I also have industry experience: I worked as a software engineer and Android developer at Altice Labs and contributed to the evaluation and construction of Generative AI systems (RAG, multi-agent pipelines).
- Memory-augmented video-language modeling
- Multimodal representation learning (vision + language alignment)
- Efficiency of LLMs and VLMs on long-context data
- LLM and VLM interpretability
- Retrieval-Augmented Generation architectures for LLMs (vector search + grounding)
- World models
- Efficient coding in the human brain
- Software Engineer at Altice Labs → computer-vision-based fitness systems for a Flutter-based mobile app 🏋🏻‍♀️

