Commit 8bbbd52

refactor: Make base client concrete and usable
The openAIModelServerClient could not be instantiated directly as it declared no supported APIs. While this may have been intended to enforce it as a base class, making it concrete provides more flexibility. This change allows the client to be used with any generic OpenAI-compatible endpoint. It also centralizes the API list so redundant overrides can be removed from the vLLM, TGI, and SGLang subclasses, improving maintainability.
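
The centralization described above can be sketched roughly as follows. This is a minimal, self-contained illustration rather than the project's actual code: `APIType` is redefined locally, `BaseOpenAIClient` stands in for `openAIModelServerClient`, and `VLLMLikeClient`, `GenericEndpointClient`, and the `dict` return type are hypothetical. Note that the diff still shows `get_prometheus_metric_metadata` marked `@abstractmethod` in the base client, so this sketch assumes a generic endpoint would still need a thin subclass that supplies it.

```python
from abc import ABC, abstractmethod
from enum import Enum, auto
from typing import List


class APIType(Enum):
    # Local stand-in for the project's APIType enum (assumption).
    Completion = auto()
    Chat = auto()


class BaseOpenAIClient(ABC):
    # Stand-in for openAIModelServerClient.
    def get_supported_apis(self) -> List[APIType]:
        # After this commit the base class owns the API list.
        return [APIType.Completion, APIType.Chat]

    @abstractmethod
    def get_prometheus_metric_metadata(self) -> dict:
        # Still abstract in the diff; dict is a simplified return type.
        ...


class VLLMLikeClient(BaseOpenAIClient):
    # No get_supported_apis override: the list is inherited.
    def get_prometheus_metric_metadata(self) -> dict:
        return {"avg_queue_length": "hypothetical_metric_name"}


class GenericEndpointClient(BaseOpenAIClient):
    # Hypothetical client for any OpenAI-compatible endpoint.
    def get_prometheus_metric_metadata(self) -> dict:
        return {}


if __name__ == "__main__":
    for client in (VLLMLikeClient(), GenericEndpointClient()):
        names = [api.name for api in client.get_supported_apis()]
        print(type(client).__name__, names)
```

Under these assumptions, a server-specific subclass only needs to override `get_supported_apis` if it supports a different set of APIs than the base list.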
1 parent 86b7ffb commit 8bbbd52

4 files changed, +1 -9 lines changed


inference_perf/client/modelserver/openai_client.py

Lines changed: 1 addition & 1 deletion
@@ -130,7 +130,7 @@ async def process_request(self, data: InferenceAPIData, stage_id: int, scheduled
         )

     def get_supported_apis(self) -> List[APIType]:
-        return []
+        return [APIType.Completion, APIType.Chat]

     @abstractmethod
     def get_prometheus_metric_metadata(self) -> PrometheusMetricMetadata:

inference_perf/client/modelserver/sglang_client.py

Lines changed: 0 additions & 2 deletions
@@ -48,8 +48,6 @@ def __init__(
         )
         self.metric_filters = [f"model_name='{model_name}'", *additional_filters]

-    def get_supported_apis(self) -> List[APIType]:
-        return [APIType.Completion, APIType.Chat]

     def get_prometheus_metric_metadata(self) -> PrometheusMetricMetadata:
         return PrometheusMetricMetadata(

inference_perf/client/modelserver/tgi_client.py

Lines changed: 0 additions & 3 deletions
@@ -48,9 +48,6 @@ def __init__(
         )
         self.metric_filters = additional_filters

-    def get_supported_apis(self) -> List[APIType]:
-        return [APIType.Completion, APIType.Chat]
-
     def get_prometheus_metric_metadata(self) -> PrometheusMetricMetadata:
         return PrometheusMetricMetadata(
             avg_queue_length=ModelServerPrometheusMetric(

inference_perf/client/modelserver/vllm_client.py

Lines changed: 0 additions & 3 deletions
@@ -48,9 +48,6 @@ def __init__(
         )
         self.metric_filters = [f"model_name='{model_name}'", *additional_filters]

-    def get_supported_apis(self) -> List[APIType]:
-        return [APIType.Completion, APIType.Chat]
-
     def get_prometheus_metric_metadata(self) -> PrometheusMetricMetadata:
         return PrometheusMetricMetadata(
             avg_queue_length=ModelServerPrometheusMetric(
