Llama3.1-8B characteristics and profiling #26

1. Llama3.1-8B profiling on H100 (optimized and unoptimized)
2. Llama3.1-8B profiling on L40S (optimized and unoptimized)
3. Llama3.1-8B: capture GPU memory usage and bandwidth
4. Endpoint metrics and statistics
5. Scope for improvement and obtaining the best numbers
6. What is the best we could have achieved via autotune?
7. Autotune with data distribution and batch size
8. Autotune with different vLLM configs and get the best number
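
One possible shape for items 4 and 8 is sketched below: summarize per-request endpoint latencies into percentiles, and run a plain grid search over a small config space to pick the best-scoring setting. This is only an illustration, not the project's actual autotuner. The helper names (`summarize_latencies`, `grid_search`, `toy_benchmark`) are hypothetical, the config keys are illustrative examples of engine settings, and `toy_benchmark` is a synthetic stand-in that would have to be replaced by a real measurement against the serving endpoint.

```python
import itertools
import math
import statistics
from typing import Callable, Dict, List, Sequence, Tuple


def summarize_latencies(latencies_s: Sequence[float]) -> Dict[str, float]:
    """Return mean and nearest-rank p50/p95/p99 of request latencies (seconds)."""
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_s)

    def percentile(p: float) -> float:
        # Nearest-rank: smallest sample with at least p% of samples at or below it.
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    return {
        "mean": statistics.fmean(ordered),
        "p50": percentile(50),
        "p95": percentile(95),
        "p99": percentile(99),
    }


def grid_search(
    space: Dict[str, List],
    benchmark: Callable[[Dict], float],
) -> Tuple[Dict, float]:
    """Try every combination in `space`; return the config with the highest score."""
    keys = list(space)
    best_cfg: Dict = {}
    best_score = float("-inf")
    for values in itertools.product(*(space[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = benchmark(cfg)  # e.g. measured throughput in tokens/s
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def toy_benchmark(cfg: Dict) -> float:
    # ASSUMPTION: synthetic placeholder score, NOT a real measurement.
    # A real run would launch the server with `cfg` and measure throughput.
    return cfg["max_num_seqs"] * cfg["gpu_memory_utilization"]


if __name__ == "__main__":
    # Illustrative search space; the keys and values are assumptions.
    space = {
        "max_num_seqs": [64, 128, 256],
        "gpu_memory_utilization": [0.8, 0.9],
    }
    best_cfg, best_score = grid_search(space, toy_benchmark)
    print("best config:", best_cfg, "score:", best_score)
    print(summarize_latencies([0.21, 0.25, 0.19, 0.40, 0.23]))
```

An exhaustive grid is only practical for small spaces. For item 7, batch size and the request length distribution could be added as further dimensions of `space`, at the cost of more benchmark runs.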
