aiha-lab

32 repositories

TANGRAM
Public
An Unstructured and Memory-Efficient Framework for LLM Serving and KV Cache Management.
JavaScript
•0•0•0•0•Updated Mar 13, 2026
OSWorld-wkl
Public
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Python
•
Apache License 2.0
•416•0•0•0•Updated Feb 5, 2026
InfiniPot-V
Public
[NeurIPS 25] InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
Python
•0•16•0•0•Updated Jan 25, 2026
MapCoder-Lite
Public
Python
•
MIT License
•0•0•1•0•Updated Jan 23, 2026
NPU-Based-Real-Time-Blind-Streaming-Assistant
Public
A real-time streaming assistant powered by Rebellions NPU, designed to operate without visual feedback and optimized for low latency.
Python
•0•0•0•0•Updated Jan 22, 2026
rp-framework
Public
Reduced-precision inference (PTQ) / training (QAT, FQT) framework for LLMs
Python
•0•1•0•0•Updated Dec 15, 2025
RILQ
Public
Python
•
Apache License 2.0
•0•5•1•0•Updated Oct 24, 2025
sqil
Public
Python
•0•6•0•0•Updated Oct 15, 2025
nvfp4-emul
Public
NVFP4 Emulation Library
Python
•0•0•0•0•Updated Sep 2, 2025
Mixup-class-Prompting
Public
0•0•0•0•Updated Jul 26, 2025
qllm-infer
Public
Quantization Framework for LLM Inferences
Python
•3•6•0•0•Updated Mar 11, 2025
MX-QLLM
Public
LLM Inference with Microscaling Format
Python
•5•34•3•0•Updated Nov 12, 2024
pim-iree
Public
Compiler and runtime implementation for PIM device.
C++
•
Apache License 2.0
•864•2•0•0•Updated Dec 15, 2023
serpim
Public
👻
C++
•
Apache License 2.0
•864•0•0•0•Updated Dec 14, 2023
iree
Public
👻
C++
•
Apache License 2.0
•864•0•0•0•Updated Dec 14, 2023
TSLD
Public
[NeurIPS 2023] Token-Scaled Logit Distillation for Ternary Weight Generative Language Models
Python
•1•18•0•0•Updated Dec 6, 2023
TVM-VTA
Public
setting
CMake
•0•0•0•0•Updated Apr 28, 2023
tpu-mlir
Public
Machine learning compiler based on MLIR for Sophgo TPU.
C++
•
Other
•201•0•0•0•Updated Jan 16, 2023
AI-System-Design
Public
AI System Design - Final Project
0•0•0•0•Updated Dec 20, 2022
AI-thermometer
Public
Python
•
Apache License 2.0
•2•10•0•1•Updated Nov 4, 2022
Optimization-method-for-human-detection-on-street-view-CCTV-images
Public
Inference code for AI Challenge (Dec 2020)
Jupyter Notebook
•
GNU General Public License v3.0
•0•6•0•0•Updated Feb 22, 2022
TernGEMM
Public
TernGEMM: General Matrix Multiply Library with Ternary Weights for Fast DNN Inference
C++
•
GNU General Public License v3.0
•1•14•1•0•Updated Feb 22, 2022
Attention-Head-Pruning
Public
Layer-wise Pruning of Transformer Heads for Efficient Language Modeling
Python
•
GNU General Public License v3.0
•1•22•0•0•Updated Feb 22, 2022
COCO-dataset-based-light-weight-fast-object-detection-model
Public
Python
•
GNU General Public License v3.0
•0•8•0•0•Updated Feb 22, 2022
ProxylessNAS-cifar10
Public
Python
•
Apache License 2.0
•0•0•0•0•Updated Aug 31, 2021
TVM-HAGO-lab
Public
Cuda
•0•0•0•0•Updated Aug 12, 2021
lsq-lab
Public
Python
•
MIT License
•0•0•0•0•Updated Aug 9, 2021
qpytorch_lab
Public
Samsung 2021 QPyTorch Lab
Jupyter Notebook
•1•0•0•0•Updated Aug 9, 2021
optimus-timeloop
Public
optimus + timeloop implementation
Python
•
MIT License
•6•0•0•0•Updated May 10, 2021
proxylessnas
Public
[ICLR 2019] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Python
•
Apache License 2.0
•284•0•0•0•Updated Mar 12, 2020