A configurable pipeline for running GuideLLM benchmarks against LLM endpoints.
- GuideLLM Overview
- Key Features
- Usage
- Configuration Options
- Results
- Output Structure
- GuideLLM Workbench
## GuideLLM Overview

GuideLLM evaluates and optimizes LLM deployments by simulating real-world inference workloads to assess performance, resource requirements, and cost implications across different hardware configurations.
## Key Features

- Performance & Scalability Testing: Analyze LLM inference under various load scenarios to meet SLOs
- Resource & Cost Optimization: Determine optimal hardware configurations and deployment strategies
- Flexible Deployment: Support for Kubernetes Jobs and Tekton Pipelines with configurable parameters
- Automated Results: Timestamped output directories with comprehensive benchmark results
## Usage

### Kubernetes Job

```bash
# Apply the PVC
kubectl apply -f utils/jobs/pvc.yaml
```
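The PVC only needs to provide a writable volume for the benchmark output. The sketch below shows roughly what `utils/jobs/pvc.yaml` might contain; the claim name `guidellm-output-pvc` matches the Tekton workspace binding later in this README, while the storage size and access mode are assumptions to adjust for your cluster.

```bash
# Illustrative only: a minimal PVC for benchmark output.
# Size and access mode are assumptions; adjust for your storage class.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: guidellm-output-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
EOF
```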
```bash
# Apply the ConfigMap (optional)
kubectl apply -f pipeline/config.yaml
```
```bash
# Run the job with default settings
kubectl apply -f utils/jobs/guidellm-job.yaml

# Or customize environment variables
kubectl set env job/run-guidellm TARGET=http://my-endpoint:8000/v1
kubectl set env job/run-guidellm MODEL_NAME=my-model
```
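Once the job is created you can follow its progress with standard kubectl commands. This is a small usage sketch; the job name `run-guidellm` is taken from the examples above, and the timeout should match your benchmark duration.

```bash
# Stream the benchmark logs while the job runs
kubectl logs -f job/run-guidellm

# Block until the job completes (adjust the timeout to your benchmark duration)
kubectl wait --for=condition=complete job/run-guidellm --timeout=30m
```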
### Tekton Pipeline

```bash
# Apply the task and pipeline
kubectl apply -f pipeline/tekton-task.yaml
kubectl apply -f pipeline/tekton-pipeline.yaml

# Run with parameters
tkn pipeline start guidellm-benchmark-pipeline \
  --param target=http://llama32-3b.llama-serve.svc.cluster.local:8000/v1 \
  --param model-name=llama32-3b \
  --param processor=RedHatAI/Llama-3.2-3B-Instruct-quantized.w8a8 \
  --param data-config='{"type":"emulated","prompt_tokens":512,"output_tokens":128}' \
  --workspace name=shared-workspace,claimName=guidellm-output-pvc
```

If you need to install the `tkn` executable on macOS, run `brew install tektoncd-cli`.
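To follow the run after it starts, the Tekton CLI can stream logs from the most recent PipelineRun. This is a usage sketch alongside the pipeline, not part of the pipeline definition itself.

```bash
# Stream logs from the most recent PipelineRun
tkn pipelinerun logs --last -f

# List previous runs and their status
tkn pipelinerun list
```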
Once the Tekton pipeline starts, the GuideLLM benchmark CLI is invoked with the supplied parameters, and the benchmark begins simulating real-world inference workloads against the target endpoint.
## Configuration Options

- `TARGET`: Model endpoint URL
- `MODEL_NAME`: Model identifier
- `PROCESSOR`: Processor/model path
- `DATA_CONFIG`: JSON data configuration
- `OUTPUT_FILENAME`: Output file name
- `RATE_TYPE`: Rate type (synchronous/poisson)
- `MAX_SECONDS`: Maximum benchmark duration
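For reference, these variables roughly correspond to flags on the GuideLLM CLI that the benchmark container invokes. The sketch below is an approximation, not the actual entrypoint script: flag names assume a recent GuideLLM release, so verify them against `guidellm benchmark --help` before relying on it.

```bash
# Illustrative mapping from the environment variables above to a GuideLLM invocation.
# Flag names are assumptions based on recent guidellm releases; verify with --help.
guidellm benchmark \
  --target "${TARGET}" \
  --model "${MODEL_NAME}" \
  --processor "${PROCESSOR}" \
  --data "${DATA_CONFIG}" \
  --rate-type "${RATE_TYPE}" \
  --max-seconds "${MAX_SECONDS}" \
  --output-path "${OUTPUT_FILENAME}"
```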
## Results

The benchmark generates comprehensive performance metrics and visualizations. The results provide detailed insights into throughput, latency, resource utilization, and other key performance indicators to help optimize your LLM deployment strategy.
## Output Structure

Results are organized in timestamped directories:

```
/output/
├── model-name_YYYYMMDD_HHMMSS/
│   ├── benchmark-results.yaml
│   └── benchmark_info.txt
└── model-name_YYYYMMDD_HHMMSS.tar.gz
```
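When the benchmark runs in-cluster, `/output` lives on the `guidellm-output-pvc` volume. One way to pull results to your workstation is to mount the PVC in a short-lived helper pod and copy the archive out; the pod name and image below are arbitrary choices for this sketch, not part of the repo.

```bash
# Illustrative only: a throwaway pod that mounts the output PVC so results can be copied out.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: results-helper
spec:
  containers:
    - name: shell
      image: registry.access.redhat.com/ubi9/ubi
      command: ["sleep", "3600"]
      volumeMounts:
        - name: output
          mountPath: /output
  volumes:
    - name: output
      persistentVolumeClaim:
        claimName: guidellm-output-pvc
EOF

# Copy the results locally, then clean up
kubectl cp results-helper:/output ./guidellm-results
kubectl delete pod results-helper

# Unpack the timestamped archive, e.g.:
tar -xzf guidellm-results/model-name_YYYYMMDD_HHMMSS.tar.gz
```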
## GuideLLM Workbench

The GuideLLM Workbench provides a user-friendly Streamlit web interface for running benchmarks interactively with real-time monitoring and result visualization.
- Interactive Configuration: Easy-to-use forms for endpoint, authentication, and benchmark parameters
- Real-time Monitoring: Live metrics parsing during benchmark execution
- Quick Stats: Sidebar with key performance indicators (requests/sec, tokens/sec, latency, TTFT)
- Results History: Session-based storage with detailed result viewing
- Download Results: Export benchmark results as YAML files
- Comprehensive Results View: Detailed breakdown of all performance metrics
```bash
cd utils/guidellm-wb
pip install -r requirements.txt
streamlit run app.py
```

```bash
# Use pre-built container from registry
podman run -p 8501:8501 quay.io/rh-aiservices-bu/guidellm-wb:v1
```

The workbench will be available at http://localhost:8501 and provides an intuitive interface for configuring and running GuideLLM benchmarks with immediate feedback and comprehensive result analysis.
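If you prefer to build the workbench image yourself rather than pull it from quay.io, something like the following should work from the workbench directory, assuming the repo ships a Containerfile there (an assumption in this sketch).

```bash
# Assumes utils/guidellm-wb contains a Containerfile; adjust the path and tag as needed.
cd utils/guidellm-wb
podman build -t guidellm-wb:local .
podman run -p 8501:8501 guidellm-wb:local
```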