A PyTorch implementation of I-JEPA (Image Joint-Embedding Predictive Architecture), inspired by the work of Yann LeCun and Meta AI.
I-JEPA is a self-supervised learning framework introduced in the paper:
"Self-supervised learning from images with a joint-embedding predictive architecture"
Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, Nicolas Ballas
arXiv:2301.08243
Unlike pixel-level reconstruction methods (e.g., MAE), I-JEPA encourages models to reason at a semantic level by predicting high-level representations of masked image regions. This results in more robust and scalable visual representations for downstream tasks.
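A minimal sketch of this idea (illustrative only, not this repo's exact API — the linear layers stand in for ViT encoders): the predictor maps context-encoder features to the representations a gradient-free target encoder assigns to masked regions, so the loss lives in latent space rather than pixel space.

```python
import copy
import torch
import torch.nn.functional as F

embed_dim = 384
context_encoder = torch.nn.Linear(embed_dim, embed_dim)  # stand-in for the context ViT
predictor = torch.nn.Linear(embed_dim, embed_dim)        # stand-in for the narrow predictor ViT
target_encoder = copy.deepcopy(context_encoder)          # EMA copy of the context encoder in the paper
for p in target_encoder.parameters():
    p.requires_grad_(False)                              # targets receive no gradient

patches = torch.randn(8, 196, embed_dim)                 # fake patch embeddings (batch 8, 14x14 grid)
pred = predictor(context_encoder(patches))               # predict target features from context
with torch.no_grad():
    target = target_encoder(patches)                     # high-level targets, not raw pixels
loss = F.smooth_l1_loss(pred, target)                    # latent-space prediction loss
loss.backward()                                          # gradients flow into encoder + predictor only
```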
- Predicts latent feature embeddings, not raw pixels
- Uses block-based masking and Vision Transformers (ViT)
- Dual-network architecture: encoder & predictor
- Flexible mask collator with custom scale/aspect-ratio
- Simple and extensible codebase for research or experimentation
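The block-based masking above can be sketched as follows (a hypothetical sampler in the spirit of I-JEPA's mask collator, not the repo's actual implementation): draw a rectangular block whose area fraction and aspect ratio fall in configurable ranges, and return the covered patch indices on the ViT's patch grid.

```python
import math
import random

def sample_block_mask(grid_h, grid_w, scale=(0.15, 0.2), aspect=(0.75, 1.5)):
    """Sample one rectangular block of patches on an H x W grid.

    `scale` bounds the block's area as a fraction of the grid;
    `aspect` bounds its height/width ratio. Returns flat patch indices.
    """
    area = random.uniform(*scale) * grid_h * grid_w
    ratio = random.uniform(*aspect)
    h = max(1, min(grid_h, int(round(math.sqrt(area * ratio)))))
    w = max(1, min(grid_w, int(round(math.sqrt(area / ratio)))))
    top = random.randint(0, grid_h - h)
    left = random.randint(0, grid_w - w)
    return [(top + i) * grid_w + (left + j) for i in range(h) for j in range(w)]

idx = sample_block_mask(14, 14)  # 14x14 patch grid, e.g. a 224px image with 16px patches
```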
Citation:
@article{assran2023ijepa,
  title={Self-supervised learning from images with a joint-embedding predictive architecture},
  author={Assran, Mahmoud and Duval, Quentin and Misra, Ishan and Bojanowski, Piotr and Vincent, Pascal and Rabbat, Michael and LeCun, Yann and Ballas, Nicolas},
  journal={arXiv preprint arXiv:2301.08243},
  year={2023}
}

Acknowledgements:
- Core concept and methodology by Meta AI Research.
- Masking and collator logic inspired by the official I-JEPA and DINO repositories.
Contact:
- GitHub: @aymen-000
- Email: aymne011@gmail.com
⭐️ Star this repo if you find it useful!
