
TorchServe with Kserve_wrapper v2 throws 'message': 'number of batch response mismatched' #2158

Open
@gavrissh

Description

🐛 Describe the bug

TorchServe supports batching multiple requests, and the batch_size value is provided while registering the model.

The request envelope receives the input as a list of multiple request bodies, but the KServe V2 request envelope picks only the first item in the list of inputs:
https://github.com/pytorch/serve/blob/master/ts/torch_handler/request_envelope/kservev2.py#L104

The result is a single output sent back as the response, causing the mismatch.
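
For context, TorchServe expects the handler to return exactly one response per request in a batch. The following is an illustrative sketch of that length check (not the actual TorchServe source), showing how a single collapsed output triggers the mismatch error reported below:

```python
# Illustrative sketch of TorchServe's batching contract (not the actual
# TorchServe source): the backend compares the number of responses returned
# by the handler against the number of requests in the batch.

def check_batch_response(batch, responses, model_name="resnet50-3"):
    if len(responses) != len(batch):
        raise RuntimeError(
            f"model: {model_name}, number of batch response mismatched, "
            f"expect: {len(batch)}, got: {len(responses)}."
        )

# A KServe V2 envelope that keeps only data[0] turns a batch of 5 requests
# into a single output, so this check fails with expect: 5, got: 1.
try:
    check_batch_response(batch=[{}] * 5, responses=[{"outputs": []}])
except RuntimeError as err:
    print(err)
```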

Error logs

TorchServe Error
stdout MODEL_LOG - model: resnet50-3, number of batch response mismatched, expect: 5, got: 1.

Installation instructions

Followed the instructions provided here: https://github.com/pytorch/serve/blob/master/kubernetes/kserve/kserve_wrapper/README.md

Model Packaging

Created a resnet50.mar using the default parameters and handler.

config.properties

inference_address=http://0.0.0.0:8085/
management_address=http://0.0.0.0:8085/
metrics_address=http://0.0.0.0:8082/
grpc_inference_port=7075
grpc_management_port=7076
enable_envvars_config=true
install_py_dep_per_model=true
enable_metrics_api=true
metrics_format=prometheus
NUM_WORKERS=1
number_of_netty_threads=4
job_queue_size=10
model_store=/mnt/models/model_store
model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"resnet50": {"1.0": {"defaultVersion": true,"marName": "resnet50.mar","minWorkers": 6,"maxWorkers": 6,"batchSize": 16,"maxBatchDelay": 200,"responseTimeout": 2000}}}}
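
For reference, the batchSize and maxBatchDelay values in the model_snapshot correspond to the batch parameters accepted when registering a model through the TorchServe management API. A hypothetical equivalent registration call (the requests snippet is only an illustration, using the management_address above) would be:

```python
# Hypothetical illustration: registering resnet50 with the same batch settings
# through the TorchServe management API instead of a model snapshot.
import requests

resp = requests.post(
    "http://0.0.0.0:8085/models",
    params={
        "url": "resnet50.mar",
        "initial_workers": 6,
        "batch_size": 16,        # same as batchSize in the snapshot
        "max_batch_delay": 200,  # same as maxBatchDelay in the snapshot
    },
)
print(resp.status_code, resp.text)
```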

Versions

Name: kserve
Version: 0.10.0

Name: torch
Version: 1.13.1+cu117

Name: torchserve
Version: 0.7.1

Repro instructions

Followed the instructions provided here: https://github.com/pytorch/serve/blob/master/kubernetes/kserve/kserve_wrapper/README.md

Run the kserve_wrapper main.py and send multiple curl inference requests using the v2 protocol.

Command used:
seq 1 10 | xargs -n1 -P 5 curl -H "Content-Type: application/json" --data @input_bytes.json http://0.0.0.0:8080/v2/models/resnet50/infer
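
The contents of input_bytes.json are not included here; a KServe V2 inference request body generally has the shape sketched below (the tensor name, shape, and data are placeholders, not the actual file):

```python
# Placeholder example of a KServe V2 inference request body; the tensor name,
# shape, and data are illustrative, not the actual contents of input_bytes.json.
import json

v2_request = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [1],
            "datatype": "BYTES",
            "data": ["<base64-encoded image bytes>"],
        }
    ]
}
print(json.dumps(v2_request))
```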

Possible Solution

Changes are required to handle TorchServe batched inputs and generate an output for every request in the batch.

Changes are needed in the parse_input() and format_output() methods in kservev2.py.
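
A rough sketch of the direction such a change could take is shown below; the method names follow the issue, but the bodies are assumptions about iterating over the full batch, not the actual kservev2.py fix:

```python
# Hypothetical sketch of batch-aware envelope methods, not the actual
# kservev2.py implementation.

def parse_input(data):
    """Collect the V2 'inputs' of every request in the TorchServe batch,
    instead of only data[0]."""
    all_inputs = []
    for row in data:
        body = row.get("body") or row.get("data") or row
        all_inputs.append(body.get("inputs", []))
    return all_inputs

def format_output(results, request_ids, model_name="resnet50"):
    """Wrap each per-request result in its own V2 response so the number of
    responses matches the batch size."""
    responses = []
    for req_id, result in zip(request_ids, results):
        responses.append(
            {
                "id": req_id,
                "model_name": model_name,
                "outputs": [
                    {
                        "name": "predict",  # placeholder output name
                        "shape": [len(result)] if hasattr(result, "__len__") else [1],
                        "datatype": "FP32",
                        "data": result,
                    }
                ],
            }
        )
    return responses
```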
