With recent advances in machine learning, new compression algorithms have appeared. These probabilistic algorithms rely on neural networks that learn to encode data with fewer bits. Their main drawback is that they require far more processing power than previous state-of-the-art compression algorithms. The goal of this project is to achieve real-time image compression on resource-constrained platforms using frugal machine learning techniques such as pruning, quantisation and knowledge distillation.
PRIM/
├── balle_bdpsnr/ # Reproducing SOTA results for several bit-rate/quality trade-offs (BD-PSNR)
├── balle_reproduction/ # Reproducing SOTA results with a single model
├── data/ # Datasets
├── codecs_experiments/ # Experiments with codecs (JPEG 2000, JPEG, WebP)
├── hybrid_kd_lic_experiments/ # Experiments with KD for LIC (latent + hyper-latent representation and output with two different teachers)
├── hyper_kd_lic_experiments/ # Experiments with KD for LIC (latent + hyper-latent representation and output)
├── kd_ae/ # Experiments with KD for image reconstruction on SOTA models used as auto-encoders (latent representation and output)
├── kd_ae_test/ # First KD experiments on a simple auto-encoder for image reconstruction (latent representation and output)
├── kd_lic_experiments/ # Experiments with KD for LIC (latent representation and output)
├── private/ # Personal data
├── reports/ # Reports (intermediate, final, papers)
├── PRIM.pdf # Initial presentation of the PRIM project
├── README.md # README
└── requirements.txt # Requirements
- The `balle_reproduction` folder contains the first step of the project: reproducing Ballé's state-of-the-art results with a single model.
- The second step was to reproduce state-of-the-art results for different bit-rate/quality trade-offs; this is contained in the `balle_bdpsnr` folder (a sketch of the BD-PSNR computation is given after this list).
- I then learned how to use knowledge distillation and tried to apply it to a simple auto-encoder for image denoising/reconstruction; the experiments and results can be found in `kd_ae_test`. The results are not impressive, to say the least...
- Next, I implemented knowledge distillation on state-of-the-art models, but using them as auto-encoders for image reconstruction. Training the teacher and student models from scratch produced great visual results. The code and results can be found in `kd_ae`.
- Finally, I adapted the previous code to perform LIC and experimented with different student architectures (numbers of channels). Code and results are in the `kd_lic_experiments` folder (a sketch of the distillation loss is given after this list).
- As discussed with my tutors, knowledge distillation can also be applied to the hyper-latent space. This is what is done in `hyper_kd_lic_experiments`.
- I also wanted to see whether using two different teacher networks (one focused on rate for the hyper-latent space, one focused on distortion for the latent space and output) could yield improved results. This research work is contained in `hybrid_kd_lic_experiments`.
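The BD-PSNR numbers in `balle_bdpsnr` follow the standard Bjøntegaard delta procedure (see the tutorial referenced below): fit a cubic polynomial of PSNR against log-rate for each codec, then compare the average fitted PSNR over the overlapping rate interval. The snippet below is a minimal NumPy sketch of that textbook formulation, not the exact code used in this repository; the function name and the example four-point rate/PSNR inputs are illustrative.

```python
import numpy as np

def bd_psnr(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average PSNR gain (dB) of the test codec over the anchor (Bjøntegaard delta).

    rate_* are bit rates (e.g. bits per pixel), psnr_* the matching PSNR values in dB;
    four rate points per curve is the classic setting.
    """
    log_r_a, log_r_t = np.log(rate_anchor), np.log(rate_test)

    # Fit a cubic polynomial PSNR = f(log rate) for each curve.
    p_a = np.polyfit(log_r_a, psnr_anchor, 3)
    p_t = np.polyfit(log_r_t, psnr_test, 3)

    # Integrate both fits over the overlapping log-rate interval.
    lo = max(log_r_a.min(), log_r_t.min())
    hi = min(log_r_a.max(), log_r_t.max())
    int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
    int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)

    # Difference of the average PSNR values over that interval.
    return (int_t - int_a) / (hi - lo)

# Illustrative numbers only:
# bd_psnr([0.25, 0.5, 1.0, 2.0], [30.1, 33.2, 36.0, 38.5],
#         [0.25, 0.5, 1.0, 2.0], [30.6, 33.8, 36.5, 39.0])
```

BD-rate is obtained the same way with the roles of the two axes swapped: fit log-rate as a function of PSNR and report the average rate difference as a percentage.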
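The distillation experiments combine the usual rate-distortion objective with terms that pull the student's latent representation (and, in `hyper_kd_lic_experiments`, its hyper-latent representation) and reconstruction towards those of a frozen teacher. The following PyTorch sketch illustrates such a loss using CompressAI's `bmshj2018_hyperprior` model for both networks (its analysis transform is exposed as `g_a`); the weights `LAMBDA_RD`, `ALPHA_LAT`, `BETA_OUT` and the quality levels are assumptions for illustration, not values taken from this repository.

```python
import math

import torch
import torch.nn.functional as F
from compressai.zoo import bmshj2018_hyperprior

# Illustrative weights, not the values used in the actual experiments.
LAMBDA_RD = 0.01   # rate-distortion trade-off
ALPHA_LAT = 1.0    # latent distillation weight
BETA_OUT = 1.0     # output distillation weight

# Frozen pretrained teacher; the student shares the architecture here so the
# latent shapes match. A slimmer student (fewer channels, as in
# kd_lic_experiments) would need e.g. a 1x1 conv to project its latent onto
# the teacher's channel dimension before the latent MSE term.
teacher = bmshj2018_hyperprior(quality=8, pretrained=True).eval()
student = bmshj2018_hyperprior(quality=8)

def kd_lic_loss(x: torch.Tensor) -> torch.Tensor:
    """Rate-distortion loss plus latent and output distillation terms."""
    with torch.no_grad():
        y_teacher = teacher.g_a(x)            # teacher latent
        x_teacher = teacher(x)["x_hat"]       # teacher reconstruction

    out = student(x)                          # {"x_hat", "likelihoods": {"y", "z"}}
    y_student = student.g_a(x)

    # Rate term: estimated bits per pixel from the entropy model likelihoods.
    num_pixels = x.size(0) * x.size(2) * x.size(3)
    bpp = sum(
        torch.log(lik).sum() / (-math.log(2) * num_pixels)
        for lik in out["likelihoods"].values()
    )

    distortion = F.mse_loss(out["x_hat"], x)         # classic RD distortion
    kd_latent = F.mse_loss(y_student, y_teacher)     # match the teacher latent
    kd_output = F.mse_loss(out["x_hat"], x_teacher)  # match the teacher output
    # hyper_kd_lic_experiments adds a similar MSE between the h_a outputs
    # (hyper-latents); the hybrid variant takes the hyper-latent target from a
    # rate-oriented teacher and the latent/output targets from a
    # distortion-oriented one.

    return bpp + LAMBDA_RD * distortion + ALPHA_LAT * kd_latent + BETA_OUT * kd_output
```

During training only the student is updated with this combined objective while the teacher stays frozen; at evaluation time the distillation terms are dropped and only rate and distortion are reported.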
- Neural/Learned Image Compression: An Overview
- End-to-end optimization of nonlinear transform codes for perceptual quality
- End-to-end Optimized Image Compression
- Variational Image Compression with a Scale Hyperprior
- Joint Autoregressive and Hierarchical Priors for Learned Image Compression
- The Devil Is in the Details: Window-based Attention for Image Compression
- A Survey on Visual Transformer
- Learned Image Compression with Mixed Transformer-CNN Architectures
- Distilling the Knowledge in a Neural Network
- Microdosing: Knowledge Distillation for GAN based Compression
- Fast and High-Performance Learned Image Compression With Improved Checkerboard Context Model, Deformable Residual Module, and Knowledge Distillation
- Cross-Architecture Knowledge Distillation
- Training data-efficient image transformers & distillation through attention
- Structural similarity index measure
- Peak signal-to-noise ratio
- Bjøntegaard Delta (BD): A Tutorial Overview of the Metric, Evolution, Challenges, and Recommendations
- CompressAI
- STF (GitHub repository of "The Devil Is in the Details")
- Bjontegaard_metric
- Weights and Biases
- fvcore
- zeus
- pynvml
- Measuring GPU Energy: Best Practices
- Télécom Paris GPU Cluster Doc
- Fabien ALLEMAND