vllm-project · maryamtahhan · Feb 27, 2026 · Apr 1, 2026 · Apr 2, 2026 · Apr 3, 2026
diff --git a/.gitignore b/.gitignore
@@ -233,3 +233,4 @@ src/ui/next-env.d.ts
 
 # e2e tests
 bin/
+embeddings_benchmarks.*
diff --git a/README.md b/README.md
@@ -199,7 +199,7 @@ guidellm benchmark run \
 
 ### Request Types and API Targets
 
-You can benchmark chat completions, text completions, or other supported request types. This example configures the benchmark to test chat completions API using a custom dataset file, with GuideLLM automatically formatting requests to match the chat completions schema.
+You can benchmark chat completions, text completions, embeddings, or other supported request types. This example configures the benchmark to test the chat completions API using a custom dataset file, with GuideLLM automatically formatting requests to match the chat completions schema.
 
 ```bash
 guidellm benchmark \
@@ -208,9 +208,20 @@ guidellm benchmark \
   --data path/to/data.json
 ```
 
+For embeddings endpoints, use the dedicated embeddings command:
+
+```bash
+guidellm benchmark run-embeddings \
+  --target http://localhost:8000 \
+  --model text-embedding-3-small \
+  --data "prompt_tokens=256" \
+  --max-requests 100
+```
+
 **Key parameters:**
 
 - `--request-type`: Specifies the API endpoint format - options include `chat_completions` (chat API format), `completions` (text completion format), `audio_transcription` (audio transcription), and `audio_translation` (audio translation).
+- For embeddings: use `guidellm benchmark run-embeddings` with `--encoding-format` (`float` or `base64`) to control the output format.
 
 ### Using Scenarios
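
Note on `--encoding-format`: the README change names `float` and `base64` as the two output formats but does not say how a `base64` payload is laid out. The sketch below assumes the OpenAI-compatible convention, in which the base64 string wraps packed little-endian float32 values. That layout is an assumption and is not confirmed by this diff. The helper names `encode_base64_embedding` and `decode_base64_embedding` are hypothetical and are not part of GuideLLM.

```python
import base64
import struct


def encode_base64_embedding(values: list[float]) -> str:
    """Pack floats as little-endian float32 and base64-encode them.

    Assumption: this mirrors the OpenAI-style `base64` embedding layout.
    """
    raw = struct.pack(f"<{len(values)}f", *values)
    return base64.b64encode(raw).decode("ascii")


def decode_base64_embedding(payload: str) -> list[float]:
    """Decode a base64 string of packed little-endian float32 values."""
    raw = base64.b64decode(payload)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round trip with values that float32 represents exactly.
sample = [0.5, -1.25, 2.0]
encoded = encode_base64_embedding(sample)
decoded = decode_base64_embedding(encoded)
```

Under this assumption, a `base64` response carries 4 bytes per dimension before encoding, which is typically smaller on the wire than the JSON list returned for `float`. That size difference is one reason to benchmark both formats.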