An efficient FP16 and FP8 GPU kernel for Double Sparse.

To run the inference demo:

    python inference_demo.py --execution_mode 1 --compressed_model_path elvircrn/llama2-7b-double-sparse-sparsity0.7-wikitext2-final --pretrained_model_path <Llama-2-7b-hf_path>
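
When the demo has to be launched from a script, it can help to build the argument list in one place. The following is a minimal sketch, not part of the repository: the helper name `build_demo_command` and the path `/path/to/Llama-2-7b-hf` are hypothetical, while the script name, the three flags, and the compressed model id are copied from the command above. It only assembles and prints the command; nothing is assumed about what the other values of `--execution_mode` do.

```python
import shlex
import sys

# Compressed model id, copied verbatim from the demo command.
COMPRESSED_MODEL = "elvircrn/llama2-7b-double-sparse-sparsity0.7-wikitext2-final"


def build_demo_command(pretrained_model_path,
                       execution_mode=1,
                       compressed_model_path=COMPRESSED_MODEL):
    """Return the argv list for inference_demo.py.

    Hypothetical helper for illustration; it uses only the flags
    that appear in the documented command.
    """
    return [
        sys.executable, "inference_demo.py",
        "--execution_mode", str(execution_mode),
        "--compressed_model_path", compressed_model_path,
        "--pretrained_model_path", str(pretrained_model_path),
    ]


if __name__ == "__main__":
    # Placeholder path (assumption): point this at a local Llama-2-7b-hf checkout.
    cmd = build_demo_command("/path/to/Llama-2-7b-hf")
    # Print a shell-quoted version of the command for inspection.
    print(shlex.join(cmd))
    # To actually launch the demo: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if the model path contains spaces.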