Repositories
- grps_trtllm Public
A higher-performance OpenAI LLM service than vLLM serve: a pure C++ OpenAI LLM service implemented with GRPS + TensorRT-LLM + Tokenizers.cpp, supporting chat and function calling, AI agents, distributed multi-GPU inference, multimodal capabilities, and a Gradio chat interface.
- grps Public
Deep learning deployment framework that supports tf/torch/trt/trtllm/vllm and other NN frameworks. It supports dynamic batching and streaming modes, works with both Python and C++, and offers scalability, extensibility, and high performance. It helps users quickly deploy models and serve them through HTTP/RPC interfaces.
- TensorRT-LLM Public (forked from NVIDIA/TensorRT-LLM)
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
- HSTU-Tensorflow Public
- ControlTalk Public
Official code for "Controllable Talking Face Generation by Implicit Facial Keypoints Editing"
People
This organization has no public members. You must be a member to see who’s a part of this organization.