Commit 4a0ce73

Add comprehensive CI testing framework for markdown-based end-to-end tests
- Implement sequential server testing with complete lifecycle management
- Add markdown parser for extracting tagged bash commands from documentation
- Create server manager with background process handling and port cleanup
- Implement AIPerf manager for test execution and result tracking
- Add test orchestrator for coordinating complete test workflows
- Support health checks with configurable timeouts (15s intervals, 5min max)
- Add comprehensive port and Docker container cleanup (non-critical failures)
- Configure timeouts: 20min server setup, 2min test commands
- Support repository-wide markdown file discovery with smart filtering
- Remove emojis for professional CI output
- Add detailed logging and real-time command output streaming
1 parent 0637181 commit 4a0ce73
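The commit message mentions a markdown parser and the `MarkdownCommand` objects that the AIPerf manager further down consumes, but the parser itself is not included in this excerpt. As a rough, hypothetical sketch of how the tagged bash blocks added to docs/tutorial.md (for example `<!-- aiperf-setup -->` ... `<!-- /aiperf-setup -->`) could be extracted, here is a minimal version; only the `tag_name`, `file_path`, and `start_line` fields (which `AIPerfManager` reads) come from the diff, and everything else, including the `script` field and function names, is assumed:

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import List

FENCE = "`" * 3  # code-fence marker, built here so it does not break this listing


@dataclass
class MarkdownCommand:
    """One tagged bash block pulled out of a markdown file (sketch of the type used below)."""
    tag_name: str    # e.g. "aiperf-setup" or "setup-dynamo-default-openai-endpoint-server"
    file_path: str   # markdown file the block came from
    start_line: int  # 1-based line of the opening tag
    script: str      # bash commands inside the fenced block (assumed field)


# Matches:  <!-- tag -->  ```bash ... ```  <!-- /tag -->
_TAG_BLOCK = re.compile(
    rf"<!--\s*(?P<tag>[\w-]+)\s*-->\s*{FENCE}bash\n(?P<body>.*?){FENCE}\s*<!--\s*/(?P=tag)\s*-->",
    re.DOTALL,
)


def extract_tagged_commands(md_file: Path) -> List[MarkdownCommand]:
    """Return every tagged bash block found in one markdown file."""
    text = md_file.read_text()
    commands = []
    for match in _TAG_BLOCK.finditer(text):
        commands.append(
            MarkdownCommand(
                tag_name=match.group("tag"),
                file_path=str(md_file),
                start_line=text[: match.start()].count("\n") + 1,
                script=match.group("body"),
            )
        )
    return commands


if __name__ == "__main__":
    # List the tagged blocks in the tutorial updated by this commit.
    for cmd in extract_tagged_commands(Path("docs/tutorial.md")):
        print(f"{cmd.file_path}:{cmd.start_line}  {cmd.tag_name}")
```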

8 files changed (+1799 −43 lines)
docs/tutorial.md

Lines changed: 74 additions & 43 deletions
@@ -26,11 +26,25 @@ models using various inference solutions.
 
 </br>
 
+<!-- aiperf-setup -->
+```bash
+# Install and set up aiperf
+# Create and activate virtual environment
+python3 -m venv .venv
+source .venv/bin/activate
+
+# Install aiperf from GitHub
+pip install git+https://github.com/ai-dynamo/aiperf.git
+
+```
+<!-- /aiperf-setup -->
+
 ## Profile Qwen3-0.6B using Dynamo <a id="dynamo-qwen3-0.6B">
 
 > [!NOTE]
 > The most up to date installation instructions for Dynamo are available on [Github](https://github.com/ai-dynamo/dynamo?tab=readme-ov-file#1-initial-setup)
 
+<!-- setup-dynamo-default-openai-endpoint-server -->
 ```bash
 # set environment variables
 export AIPERF_REPO_TAG="main"
@@ -54,76 +68,93 @@ docker run \
 --gpus all \
 --network host \
 ${DYNAMO_PREBUILT_IMAGE_TAG} \
-/bin/bash -c "python3 -m dynamo.frontend & python3 -m dynamo.vllm --model ${MODEL} --enforce-eager --no-enable-prefix-caching" > server.log 2>&1 &
-
-# Set up AIPerf
-docker run \
--it \
---rm \
---gpus all \
---network host \
--e AIPERF_REPO_TAG=${AIPERF_REPO_TAG} \
--e MODEL=${MODEL} \
-ubuntu:24.04
-
-apt update && apt install -y curl git
-
-curl -LsSf https://astral.sh/uv/install.sh | sh
-
-source $HOME/.local/bin/env
-
-uv venv --python 3.10
-
-source .venv/bin/activate
-
-git clone -b ${AIPERF_REPO_TAG} --depth 1 https://github.com/ai-dynamo/aiperf.git
+/bin/bash -c "python3 -m dynamo.frontend & python3 -m dynamo.vllm --model ${MODEL} --enforce-eager --no-enable-prefix-caching"
+```
+<!-- /setup-dynamo-default-openai-endpoint-server -->
 
-uv pip install ./aiperf
 
-# At this point, Dynamo may not be ready.
-# The following command will return when Dynamo is ready for requests.
-while [ "$(curl -s -o /dev/null -w '%{http_code}' localhost:8080/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"'"${MODEL}"'","messages":[{"role":"user","content":"a"}],"max_completion_tokens":1}')" != "200" ]; do sleep 1; done
+<!-- health-check-dynamo-default-openai-endpoint-server -->
+```bash
+# At this point, Dynamo server may not be ready.
+# The following command will return when Dynamo server is ready for requests.
+# Try for 5 minutes (20 attempts at 15-second intervals)
+attempt=0
+while [ $attempt -lt 20 ] && [ "$(curl -s -o /dev/null -w '%{http_code}' localhost:8080/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"'"${MODEL}"'","messages":[{"role":"user","content":"a"}],"max_completion_tokens":1}')" != "200" ]; do
+  echo "Waiting for Dynamo server to be ready..."
+  sleep 15
+  attempt=$((attempt + 1))
+done
+```
+<!-- /health-check-dynamo-default-openai-endpoint-server -->
 
 # Profile the model
+
+<!-- aiperf-run-dynamo-default-openai-endpoint-server -->
+```bash
 aiperf profile \
 --model Qwen/Qwen3-0.6B \
 --endpoint-type chat \
 --endpoint /v1/chat/completions \
 --streaming \
 --url localhost:8080 \
---synthetic-input-tokens-mean 1000 \
+--synthetic-input-tokens-mean 10 \
 --synthetic-input-tokens-stddev 0 \
---output-tokens-mean 2000 \
+--output-tokens-mean 20 \
 --output-tokens-stddev 0 \
---extra-inputs min_tokens:2000 \
+--extra-inputs min_tokens:2 \
 --extra-inputs ignore_eos:true \
---concurrency 2048 \
---request-count 6144 \
---warmup-request-count 1000 \
+--concurrency 2 \
+--request-count 32 \
+--warmup-request-count 2 \
 --conversation-num 8000 \
---random-seed 100 \
+--random-seed 8 \
 -v \
 -H 'Authorization: Bearer NOT USED' \
 -H 'Accept: text/event-stream'
 ```
+<!-- /aiperf-run-dynamo-default-openai-endpoint-server -->
 
 ## Profile Qwen3-0.6B using vllm <a id="vllm-qwen3-0.6B">
+
+<!-- setup-vllm-default-openai-endpoint-server -->
 ```bash
-# Install vLLM from pip:
-pip install vllm
+# Pull and run the vLLM OpenAI-compatible server container
+docker pull vllm/vllm-openai:latest
+docker run --gpus all -p 8000:8000 vllm/vllm-openai:latest \
+--model Qwen/Qwen3-0.6B \
+--host 0.0.0.0 --port 8000
+```
+<!-- /setup-vllm-default-openai-endpoint-server -->
 
-# Load and run the model:
-vllm serve "Qwen/Qwen3-0.6B"
+<!-- health-check-vllm-default-openai-endpoint-server -->
+```bash
+# At this point, vLLM server may not be ready.
+# The following command will return when vLLM server is ready for requests.
+
+MODEL="Qwen/Qwen3-0.6B"
+# Try for 5 minutes (20 attempts at 15-second intervals)
+attempt=0
+while [ $attempt -lt 20 ] && [ "$(curl -s -o /dev/null -w '%{http_code}' \
+  http://localhost:8000/v1/chat/completions \
+  -H 'Content-Type: application/json' \
+  -d '{"model":"'"${MODEL}"'","messages":[{"role":"user","content":"ping"}],"max_completion_tokens":1}')" != "200" ]; do
+  echo "Waiting for vLLM server to be ready..."
+  sleep 15
+  attempt=$((attempt + 1))
+done
+```
+<!-- /health-check-vllm-default-openai-endpoint-server -->
 
-uv venv
-source .venv/bin/activate
-pip install git+https://github.com/ai-dynamo/aiperf.git
 
+<!-- aiperf-run-vllm-default-openai-endpoint-server -->
+```bash
 aiperf profile \
 --model Qwen/Qwen3-0.6B \
+--url localhost:8000 \
 --endpoint-type chat \
 --endpoint /v1/chat/completions \
 --streaming \
---request-rate 1000 \
---request-count 6500
+--request-rate 2 \
+--request-count 8
 ```
+<!-- /aiperf-run-vllm-default-openai-endpoint-server -->
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+# Test framework for markdown-based CI testing
Lines changed: 156 additions & 0 deletions
@@ -0,0 +1,156 @@
+#!/usr/bin/env python3
+"""
+AIPerf Manager - Handles AIPerf setup and test execution
+"""
+
+import logging
+from typing import List, Dict, Optional
+from dataclasses import dataclass
+
+logger = logging.getLogger(__name__)
+
+@dataclass
+class AIPerfTestResult:
+    """Result of an AIPerf test run"""
+    server_id: str
+    command: 'MarkdownCommand'
+    success: bool
+    execution_time: float
+    error_message: Optional[str] = None
+
+class AIPerfManager:
+    """Manages AIPerf setup and test execution"""
+
+    def __init__(self, server_manager):
+        self.server_manager = server_manager
+        self.setup_completed = False
+        self.setup_command = None
+        self.test_results: List[AIPerfTestResult] = []
+
+    def discover_aiperf_commands(self, commands: List['MarkdownCommand']) -> None:
+        """Discover AIPerf setup and run commands"""
+        logger.info("Discovering AIPerf commands...")
+
+        for cmd in commands:
+            if cmd.tag_name == 'aiperf-setup':
+                self.setup_command = cmd
+                logger.info(f"Found AIPerf setup command: {cmd.file_path}:{cmd.start_line}")
+
+        if not self.setup_command:
+            logger.warning("No AIPerf setup command found")
+
+    def setup_aiperf(self) -> bool:
+        """Setup AIPerf (run only once)"""
+        if self.setup_completed:
+            logger.info("AIPerf already set up, skipping...")
+            return True
+
+        if not self.setup_command:
+            logger.error("No AIPerf setup command available")
+            return False
+
+        logger.info("Setting up AIPerf...")
+        success = self.server_manager._execute_command(self.setup_command, timeout=120)  # 2 minute timeout for setup
+
+        if success:
+            self.setup_completed = True
+            logger.info("AIPerf setup completed successfully")
+        else:
+            logger.error("AIPerf setup failed")
+
+        return success
+
+    def run_tests_for_server(self, server_id: str) -> List[AIPerfTestResult]:
+        """Run all AIPerf tests for a specific server"""
+        if not self.setup_completed:
+            logger.error("AIPerf not set up yet")
+            return []
+
+        if server_id not in self.server_manager.servers:
+            logger.error(f"Server {server_id} not found")
+            return []
+
+        server = self.server_manager.servers[server_id]
+        if not server.aiperf_run_commands:
+            logger.info(f"No AIPerf run commands found for server {server_id}")
+            return []
+
+        logger.info(f"Running {len(server.aiperf_run_commands)} AIPerf tests for server {server_id}")
+
+        results = []
+        for i, cmd in enumerate(server.aiperf_run_commands, 1):
+            logger.info(f"Running AIPerf test {i}/{len(server.aiperf_run_commands)} for {server_id}")
+
+            import time
+            start_time = time.time()
+            success = self.server_manager._execute_command(cmd, timeout=120)  # 2 minute timeout for tests
+            execution_time = time.time() - start_time
+
+            result = AIPerfTestResult(
+                server_id=server_id,
+                command=cmd,
+                success=success,
+                execution_time=execution_time,
+                error_message=None if success else f"Test failed for {cmd.tag_name}"
+            )
+            results.append(result)
+            self.test_results.append(result)
+
+            if success:
+                logger.info(f"AIPerf test {i} completed successfully in {execution_time:.2f}s")
+            else:
+                logger.error(f"AIPerf test {i} failed after {execution_time:.2f}s")
+
+        return results
+
+    def run_all_tests(self) -> Dict[str, List[AIPerfTestResult]]:
+        """Run all AIPerf tests for all servers"""
+        if not self.setup_completed:
+            logger.error("AIPerf not set up yet")
+            return {}
+
+        all_results = {}
+
+        for server_id, server in self.server_manager.servers.items():
+            if server.aiperf_run_commands:
+                logger.info(f"Running tests for server: {server_id}")
+                results = self.run_tests_for_server(server_id)
+                all_results[server_id] = results
+            else:
+                logger.info(f"No tests to run for server: {server_id}")
+                all_results[server_id] = []
+
+        return all_results
+
+    def log_test_summary(self):
+        """Log summary of all test results"""
+        logger.info("="*80)
+        logger.info("AIPERF TEST RESULTS SUMMARY")
+        logger.info("="*80)
+
+        total_tests = len(self.test_results)
+        successful_tests = sum(1 for result in self.test_results if result.success)
+        failed_tests = total_tests - successful_tests
+
+        logger.info(f"Total Tests: {total_tests}")
+        logger.info(f"Successful: {successful_tests}")
+        logger.info(f"Failed: {failed_tests}")
+        logger.info(f"Success Rate: {(successful_tests/total_tests*100):.1f}%" if total_tests > 0 else "N/A")
+
+        # Group by server
+        server_results = {}
+        for result in self.test_results:
+            if result.server_id not in server_results:
+                server_results[result.server_id] = []
+            server_results[result.server_id].append(result)
+
+        for server_id, results in server_results.items():
+            logger.info(f"\nServer: {server_id}")
+            for i, result in enumerate(results, 1):
+                status = "PASS" if result.success else "FAIL"
+                logger.info(f"  Test {i}: {status} ({result.execution_time:.2f}s) - {result.command.tag_name}")
+                if not result.success and result.error_message:
+                    logger.info(f"    Error: {result.error_message}")
+
+        logger.info("="*80)
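For orientation, a hypothetical driver showing how the `AIPerfManager` above could be exercised end to end. The commit's real `ServerManager` and orchestrator are not part of this excerpt, so `StubServerManager` is a stand-in invented here that exposes only the two members `AIPerfManager` touches (`servers` and `_execute_command`); `extract_tagged_commands` is the parser sketched near the top of this page:

```python
import logging
import types
from pathlib import Path

logging.basicConfig(level=logging.INFO)


class StubServerManager:
    """Stand-in for the commit's ServerManager; not the committed implementation."""

    def __init__(self):
        self.servers = {}  # server_id -> object exposing .aiperf_run_commands

    def _execute_command(self, cmd, timeout):
        # The real server manager would run the bash block, stream its output,
        # and enforce the timeout; here we just report what would run.
        print(f"would run [{cmd.tag_name}] from {cmd.file_path}:{cmd.start_line} (timeout={timeout}s)")
        return True


manager = StubServerManager()
aiperf = AIPerfManager(manager)

# Commands come from the markdown parser sketch; docs/tutorial.md is the file
# this commit tags with aiperf-setup / setup-* / health-check-* / aiperf-run-* blocks.
commands = extract_tagged_commands(Path("docs/tutorial.md"))
aiperf.discover_aiperf_commands(commands)

if aiperf.setup_aiperf():
    # Register one server whose tests are the aiperf-run-* blocks from the tutorial.
    manager.servers["dynamo-default-openai-endpoint-server"] = types.SimpleNamespace(
        aiperf_run_commands=[c for c in commands if c.tag_name.startswith("aiperf-run-")]
    )
    aiperf.run_all_tests()
    aiperf.log_test_summary()
```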
