elvircrn/double_sparse_kernel

An efficient FP16 and FP8 GPU kernel for Double Sparse.

To run the inference demo, replace <Llama-2-7b-hf_path> with the path to your Llama-2-7b-hf model:

python inference_demo.py --execution_mode 1 --compressed_model_path elvircrn/llama2-7b-double-sparse-sparsity0.7-wikitext2-final --pretrained_model_path <Llama-2-7b-hf_path>
