Skip to content

feat(batch): add multi-endpoint body validation and testing#1982

Open
NJX-njx wants to merge 2 commits intovllm-project:mainfrom
NJX-njx:feat/1653-batch-api-multi-endpoint
Open

feat(batch): add multi-endpoint body validation and testing#1982
NJX-njx wants to merge 2 commits intovllm-project:mainfrom
NJX-njx:feat/1653-batch-api-multi-endpoint

Conversation

@NJX-njx
Copy link

@NJX-njx NJX-njx commented Mar 4, 2026

Summary

Expands batch API to fully validate and test all supported endpoints: /v1/chat/completions, /v1/completions, /v1/embeddings, and /v1/rerank.

Changes

1. Endpoint-specific body validation (\�atch.py)

Added _validate_request_body_for_endpoint()\ that checks request bodies have the required fields for each endpoint type:

Endpoint Required Body Fields
/v1/chat/completions\ \model, \messages\ (array)
/v1/completions\ \model, \prompt\ (string or array)
/v1/embeddings\ \model, \input\ (string or array)
/v1/rerank\ \model, \query\ (string), \documents\ (array)

This validation runs during _validate_batch_input_file()\ to catch malformed requests early before job execution.

2. Multi-endpoint input data generators

Both e2e test files (\ est_e2e_openai_batch_api.py\ and \ ests/e2e/test_batch_api.py) now have:

  • \ENDPOINT_SAMPLE_BODIES\ dict with valid sample request bodies per endpoint
  • \generate_batch_input_data(endpoint=...)\ supporting all 4 endpoint types

3. Parametrized e2e test

Added \ est_openai_batch_api_multi_endpoint\ that exercises the full batch workflow (upload create poll verify) for each of the 4 endpoint types.

4. Body validation unit tests

Added \TestEndpointBodyValidation\ class in \ est_batch_endpoints.py\ with 20+ test cases covering:

  • Valid bodies for all endpoints
  • Missing required fields
  • Invalid field types (wrong types for messages, prompt, input, documents, query)
  • Parametrized cross-endpoint validation

Closes #1653

Expand batch API testing and validation to cover all supported endpoints:
/v1/chat/completions, /v1/completions, /v1/embeddings, and /v1/rerank.

Changes:
- Add endpoint-specific body validation in _validate_batch_input_file:
  - chat/completions: requires model + messages (array)
  - completions: requires model + prompt (string or array)
  - embeddings: requires model + input (string or array)
  - rerank: requires model + query (string) + documents (array)
- Add multi-endpoint input data generators with sample bodies for each type
- Add parametrized e2e test (test_openai_batch_api_multi_endpoint) covering all 4 endpoints
- Add TestEndpointBodyValidation test class with 20+ unit tests for body validation
- Update e2e test generators to accept endpoint parameter

Closes vllm-project#1653
Copilot AI review requested due to automatic review settings March 4, 2026 11:21
@chatgpt-codex-connector
Copy link

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the robustness and reliability of the batch API by introducing comprehensive validation for request bodies across all supported AI endpoints. It ensures that incoming batch job requests adhere to the specific requirements of each endpoint, preventing malformed data from proceeding to job execution. Furthermore, the changes include extensive testing to verify the correct functioning of this new validation logic and the overall batch workflow for multi-endpoint scenarios.

Highlights

  • Endpoint Body Validation: Implemented a new function _validate_request_body_for_endpoint to ensure request bodies for various batch API endpoints (/v1/chat/completions, /v1/completions, /v1/embeddings, /v1/rerank) contain all required fields and correct data types.
  • Early Validation Integration: Integrated the new body validation logic into the _validate_batch_input_file process to catch malformed requests before job execution, improving error handling efficiency.
  • Multi-Endpoint Test Data Generation: Enhanced test utility functions (generate_batch_input_data) in e2e test files to dynamically create batch input data for all supported API endpoints, facilitating broader test coverage.
  • Parametrized E2E Testing: Introduced a new parametrized end-to-end test (test_openai_batch_api_multi_endpoint) to comprehensively validate the full batch workflow across all four supported endpoint types.
  • Comprehensive Unit Tests: Added a dedicated test class TestEndpointBodyValidation with over 20 unit test cases to cover various scenarios for endpoint-specific body validation, including valid bodies, missing fields, and invalid field types.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • python/aibrix/aibrix/metadata/api/v1/batch.py
    • Added ENDPOINT_REQUIRED_BODY_FIELDS dictionary to define required fields for each batch job endpoint.
    • Introduced _validate_request_body_for_endpoint function to perform detailed validation of request bodies based on the target endpoint, checking for required fields and correct data types (e.g., messages as list, prompt as string/list).
    • Integrated the call to _validate_request_body_for_endpoint within _validate_batch_input_file to apply body validation during batch input file processing.
  • python/aibrix/tests/batch/test_batch_endpoints.py
    • Imported _validate_request_body_for_endpoint for testing.
    • Added TestEndpointBodyValidation class containing numerous unit tests for _validate_request_body_for_endpoint, covering valid and invalid request bodies for chat completions, completions, embeddings, and rerank endpoints, including tests for missing fields and incorrect data types.
    • Included a parametrized test to ensure all supported endpoints accept their respective valid bodies.
  • python/aibrix/tests/e2e/test_batch_api.py
    • Defined ENDPOINT_SAMPLE_BODIES to store valid sample request bodies for each supported endpoint.
    • Modified generate_batch_input_data to accept an endpoint parameter, enabling it to generate batch input data for any specified endpoint.
  • python/aibrix/tests/e2e/test_openai_batch_api.py
    • Defined ENDPOINT_SAMPLE_BODIES to store valid sample request bodies for each supported endpoint.
    • Modified generate_batch_input_data to accept an endpoint parameter, allowing it to generate batch input data for any specified endpoint.
    • Added test_openai_batch_api_multi_endpoint, a parametrized asynchronous test that executes the full batch API workflow (upload, create, poll, verify) for each of the four supported endpoints.
Activity
  • No specific activity (comments, reviews, etc.) was provided in the context for this pull request.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly improves the batch processing system by introducing robust body validation for API endpoints using the new _validate_request_body_for_endpoint function, along with comprehensive unit and end-to-end tests. However, a critical security flaw allows attackers to bypass this validation by providing a full URL in the input file instead of just the path, potentially leading to malformed requests reaching the inference engine. To address this, it is crucial to use the validated endpoint path from the batch request for body validation.

Comment on lines +196 to +197
request["body"], request_url, line_num
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The _validate_batch_input_file function passes the untrusted request_url from the input file to _validate_request_body_for_endpoint instead of using the validated endpoint from the batch request. Since _validate_batch_input_file only checks if request_url ends with the expected endpoint (line 187), an attacker can provide a full URL (e.g., http://attacker.com/v1/chat/completions) that passes the check but fails to match any key in ENDPOINT_REQUIRED_BODY_FIELDS (which contains only paths like /v1/chat/completions). This causes _validate_request_body_for_endpoint to skip validation (line 74), allowing malformed request bodies to bypass the early validation check.

            body_error = _validate_request_body_for_endpoint(
                request["body"], endpoint, line_num
            )

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Expands the batch API to validate request bodies per endpoint and broadens automated coverage to exercise all supported batch endpoints end-to-end.

Changes:

  • Added endpoint-specific request body validation during batch input file validation.
  • Added multi-endpoint batch input generators for e2e tests.
  • Added parametrized e2e test and new unit tests covering endpoint-specific body validation.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
python/aibrix/aibrix/metadata/api/v1/batch.py Adds endpoint-aware request body validation during batch input parsing.
python/aibrix/tests/batch/test_e2e_openai_batch_api.py Adds multi-endpoint generators and a parametrized e2e workflow test across endpoints.
python/aibrix/tests/e2e/test_batch_api.py Generalizes batch input generator to support multiple endpoints.
python/aibrix/tests/batch/test_batch_endpoints.py Adds unit tests for endpoint-specific body validation logic.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +562 to +570
# Step 4: Download and verify output
output_response = client.get(f"/v1/files/{output_file_id}/content")
assert output_response.status_code == 200

output_content = output_response.content.decode("utf-8")
assert output_content, f"[{endpoint}] Output file is empty"
assert verify_batch_output_content(output_content, num_requests), (
f"[{endpoint}] Output verification failed"
)
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verify_batch_output_content() (per provided context) asserts response.body contains ["model", "choices"], which will not match typical responses for /v1/embeddings (usually data) or /v1/rerank (usually results). This makes the new parametrized test fail for non-chat endpoints. Update verification to be endpoint-aware (e.g., branch assertions by endpoint), or generalize the verifier to validate only common fields plus endpoint-specific payload shape.

Copilot uses AI. Check for mistakes.
Comment on lines +71 to +74
required_fields = ENDPOINT_REQUIRED_BODY_FIELDS.get(endpoint)
if required_fields is None:
# Unknown endpoint, skip body validation
return None
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Body validation is skipped entirely when endpoint doesn’t exactly match a key in ENDPOINT_REQUIRED_BODY_FIELDS. Since the value passed in comes from the batch input line (request_url), any benign variation (e.g., trailing slash, query string) would bypass validation. Consider normalizing the endpoint before lookup (e.g., strip query/fragment, strip trailing /) and/or validating against the canonical batch job endpoint (the endpoint parameter passed to _validate_batch_input_file) rather than the per-line URL string.

Copilot uses AI. Check for mistakes.
Comment on lines +20 to 22
from aibrix.metadata.api.v1.batch import _validate_request_body_for_endpoint


Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are importing and depending on a private function (_validate_request_body_for_endpoint). This increases coupling and makes refactors harder (renames/moves break tests even if behavior is unchanged). Consider promoting this validator to a non-underscored helper (or moving it into a small shared validation module) and importing that public surface from tests.

Suggested change
from aibrix.metadata.api.v1.batch import _validate_request_body_for_endpoint
import aibrix.metadata.api.v1.batch as batch_module
_validate_request_body_for_endpoint = batch_module._validate_request_body_for_endpoint

Copilot uses AI. Check for mistakes.
Comment on lines +86 to +99
return f"Line {line_num}: 'messages' must be an array for {endpoint}"
elif endpoint == BatchJobEndpoint.COMPLETIONS.value:
prompt = body.get("prompt")
if not isinstance(prompt, (str, list)):
return f"Line {line_num}: 'prompt' must be a string or array for {endpoint}"
elif endpoint == BatchJobEndpoint.EMBEDDINGS.value:
input_val = body.get("input")
if not isinstance(input_val, (str, list)):
return f"Line {line_num}: 'input' must be a string or array for {endpoint}"
elif endpoint == BatchJobEndpoint.RERANK.value:
if not isinstance(body.get("query"), str):
return f"Line {line_num}: 'query' must be a string for {endpoint}"
if not isinstance(body.get("documents"), list):
return f"Line {line_num}: 'documents' must be an array for {endpoint}"
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error messages use “array”, which is more JavaScript/JSON terminology; in Python-facing validation errors, “list” is usually clearer (especially since the check is isinstance(..., list)). Consider rewording to “list” (and similarly for other messages) to make failures more actionable/debuggable.

Suggested change
return f"Line {line_num}: 'messages' must be an array for {endpoint}"
elif endpoint == BatchJobEndpoint.COMPLETIONS.value:
prompt = body.get("prompt")
if not isinstance(prompt, (str, list)):
return f"Line {line_num}: 'prompt' must be a string or array for {endpoint}"
elif endpoint == BatchJobEndpoint.EMBEDDINGS.value:
input_val = body.get("input")
if not isinstance(input_val, (str, list)):
return f"Line {line_num}: 'input' must be a string or array for {endpoint}"
elif endpoint == BatchJobEndpoint.RERANK.value:
if not isinstance(body.get("query"), str):
return f"Line {line_num}: 'query' must be a string for {endpoint}"
if not isinstance(body.get("documents"), list):
return f"Line {line_num}: 'documents' must be an array for {endpoint}"
return f"Line {line_num}: 'messages' must be a list for {endpoint}"
elif endpoint == BatchJobEndpoint.COMPLETIONS.value:
prompt = body.get("prompt")
if not isinstance(prompt, (str, list)):
return f"Line {line_num}: 'prompt' must be a string or list for {endpoint}"
elif endpoint == BatchJobEndpoint.EMBEDDINGS.value:
input_val = body.get("input")
if not isinstance(input_val, (str, list)):
return f"Line {line_num}: 'input' must be a string or list for {endpoint}"
elif endpoint == BatchJobEndpoint.RERANK.value:
if not isinstance(body.get("query"), str):
return f"Line {line_num}: 'query' must be a string for {endpoint}"
if not isinstance(body.get("documents"), list):
return f"Line {line_num}: 'documents' must be a list for {endpoint}"

Copilot uses AI. Check for mistakes.
@varungup90
Copy link
Collaborator

Can you sign the commit and help address copilot comments.

@varungup90 varungup90 requested a review from zhangjyr March 5, 2026 23:25
Copy link
Collaborator

@zhangjyr zhangjyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with the validation improvement. @NJX-njx Can you fix DCO and Gemini comments?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[batch] Currently only /v1/chat/completions is tested. We need to expand to /v1/completions or /v1/embeddings and /v1/moderation at least

4 participants