
Differentiable Lookup-Based Matrix Multiplication for Compressing Transformer Network

Overview


In recent years, there has been research on replacing multiplication operations with cheaper alternatives, motivated by measurements showing that a multiplication consumes substantially more energy in hardware than an addition.

(Figure: energy cost comparison of multiplication vs. addition.)

As a result, approaches like AdderNet replace multiplication in convolutions with addition, while ShiftCNN represents weights as powers of two, allowing multiplication to be replaced with bit-shift operations.
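As a rough illustration of the bit-shift idea (this is not code from ShiftCNN, and the function names are mine), restricting a weight to a signed power of two turns each multiplication into a sign flip plus a shift:

```python
# Sketch: power-of-two weight quantization, so x * w becomes a bit shift.
import math

def quantize_to_power_of_two(w: float):
    """Return (sign, exponent) so that w is approximated by sign * 2**exponent."""
    sign = 1 if w >= 0 else -1
    exponent = round(math.log2(abs(w)))
    return sign, exponent

def shift_multiply(x: int, sign: int, exponent: int) -> int:
    """Compute x * sign * 2**exponent with shifts only (integer x)."""
    if exponent >= 0:
        return sign * (x << exponent)
    return sign * (x >> -exponent)

# Example: weight 0.25 is exactly 2**-2, so x * 0.25 becomes x >> 2.
sign, exp = quantize_to_power_of_two(0.25)
print(shift_multiply(16, sign, exp))  # 4
```

The quantization is lossy for weights that are not exact powers of two; practical schemes compensate for this, e.g. by retraining after quantization.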

More recently, some work has replaced the multiply-accumulate (MAC) operations in matrix multiplication with table lookups and additions (source).
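A minimal sketch of the idea, assuming a product-quantization-style scheme (all array names and sizes below are illustrative, not this repository's code): each input row is split into subvectors, each subvector is snapped to its nearest learned prototype, and the matmul then reduces to summing precomputed prototype-times-weight partial products from a table:

```python
# Sketch of lookup-based matrix multiplication (PQ-style, assumed).
import numpy as np

rng = np.random.default_rng(0)
D, C, K, M = 8, 4, 16, 3          # input dim, subspaces, prototypes, outputs
sub = D // C                      # subvector length

X = rng.standard_normal((5, D))   # activations (online)
W = rng.standard_normal((D, M))   # weights (known offline)
prototypes = rng.standard_normal((C, K, sub))  # per-subspace codebooks

# Offline: table[c, k, m] = prototypes[c, k] . W[c-th row slice, m]
table = np.einsum('cks,csm->ckm', prototypes, W.reshape(C, sub, M))

# Online: encode each subvector to its nearest prototype (the only
# distance computation), then the matmul is pure lookup + add.
Xs = X.reshape(len(X), C, sub)
dists = ((Xs[:, :, None, :] - prototypes[None]) ** 2).sum(-1)
codes = dists.argmin(-1)                        # (N, C) prototype indices
Y_lut = table[np.arange(C), codes].sum(axis=1)  # (N, M)
```

`Y_lut` approximates `X @ W`; the approximation is exact whenever each subvector of `X` coincides with one of its subspace's prototypes, which is why prototype quality (and retraining around the quantization) matters.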

The approach is a little involved; if you find it interesting, further details are available here.

Usage

This is a research-oriented project without a complete usage guide yet, but here is what each file is intended to do:

  • demo.py: Compiles the model using TVM to find the optimal parameters (block size) for hardware and runs inference.
  • prototype_learning.py: Initializes prototypes using KMCUDA.
  • tensorrt_op.py: Attempts to compile the model using torch_tensorrt and runs it on the GPU after compilation.
  • train.py and other files containing "train": Retrain the model after matrix multiplication has been replaced with the lookup-based version.
  • OpCounter.ipynb: Measures GFLOPs and model size (using thop) after matrix multiplication has been replaced with the lookup-based version.

Short Flow: prototype_learning.py -> train.py -> demo.py
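The prototype-learning step at the start of this flow can be pictured as plain k-means over activation subvectors. The toy NumPy sketch below is mine, not the repository's code; prototype_learning.py uses KMCUDA for the same job:

```python
# Toy sketch: initialize per-subspace prototypes with naive k-means.
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Very small Lloyd's-algorithm k-means; returns (k, dim) centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
        assign = d.argmin(1)
        for j in range(k):
            pts = X[assign == j]
            if len(pts):
                centers[j] = pts.mean(0)
    return centers

# Collected activation subvectors (here random stand-ins) -> 16 prototypes.
acts = np.random.default_rng(1).standard_normal((256, 4))
protos = kmeans(acts, k=16)
print(protos.shape)  # (16, 4)
```

After this initialization, the retraining step (train.py) fine-tunes the model with the lookup-based layers in place.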

Contributions

  • Developed a comprehensive training pipeline, including support for ImageNet-scale training.
  • Achieved an accuracy improvement of up to 10% over LUT-NN (MobiCom 2023).

Contact

Feel free to contact me (a0917bc(at)gmail(dot)com) if you have any questions.

