Software Engineer @ Meta, LLM Inference
- Menlo Park
- LinkedIn: in/ye-charlotte-qi
Highlights
- Pro
Pinned
- neulab/word-embeddings-for-nmt: Supplementary material for "When and Why Are Pre-trained Word Embeddings Useful for Neural Machine Translation?" (NAACL 2018)
- pytorch/pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
- vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs
- EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models
- vllm-project/flash-attention (forked from Dao-AILab/flash-attention): Fast and memory-efficient exact attention
- Dao-AILab/flash-attention: Fast and memory-efficient exact attention