NVIDIA
Toronto, Canada
Pinned
- NVIDIA/TensorRT-LLM: TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.
- triton-inference-server/server: The Triton Inference Server provides an optimized cloud and edge inferencing solution.
- triton-inference-server/python_backend: Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python.
- triton-inference-server/model_analyzer: Triton Model Analyzer is a CLI tool for understanding the compute and memory requirements of Triton Inference Server models.
- learning-to-quantize: Code for "Adaptive Gradient Quantization for Data-Parallel SGD", published in NeurIPS 2020.
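The learning-to-quantize repo accompanies a paper on adaptive gradient quantization for data-parallel SGD. As a rough illustration of the general idea only (this is a generic stochastic uniform quantizer, not the paper's adaptive scheme, and the function name and parameters are invented for this sketch):

```python
import math
import random

def quantize_stochastic(grad, num_levels=4, seed=0):
    """Generic stochastic uniform quantization (illustrative sketch):
    map each coordinate of `grad` onto one of `num_levels` evenly
    spaced magnitudes in [0, max|g|], rounding up or down at random
    so the quantized gradient is unbiased in expectation."""
    rnd = random.Random(seed)
    norm = max(abs(g) for g in grad)
    if norm == 0:
        return [0.0] * len(grad)
    out = []
    for g in grad:
        scaled = abs(g) / norm * (num_levels - 1)  # position in [0, L-1]
        lower = math.floor(scaled)
        # round up with probability equal to the fractional part,
        # which keeps E[quantized] equal to the original coordinate
        level = lower + (rnd.random() < scaled - lower)
        out.append(math.copysign(level / (num_levels - 1) * norm, g))
    return out
```

Each worker would send only the level index and sign per coordinate (plus one scale per vector), which is what makes schemes like this attractive for reducing communication in data-parallel training; the paper's contribution is choosing the level placement adaptively rather than uniformly.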