5 changes: 2 additions & 3 deletions specification/protocol/inference_rest.md
@@ -55,7 +55,6 @@ The model ready endpoint returns the readiness probe response for the server alo

 $ready_model_response =
 {
-"name" : $string,
 "ready": $bool
 }

@@ -68,7 +67,7 @@ The server ready endpoint returns the readiness probe response for the server.

 $ready_server_response =
 {
-"live" : $bool,
+"ready" : $bool
 }

 ---
@@ -81,7 +80,7 @@ The server live endpoint returns the liveness probe response for the server.

 $live_server_response =
 {
-"live" : $bool,
+"live" : $bool
 }

 ---
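
As a reviewer aid (not part of this change), the following is a minimal Python sketch of how a client might read the three corrected response objects. It assumes each endpoint returns a JSON object carrying the single boolean field shown in the hunks above; the helper names `parse_bool_field`, `is_server_live`, `is_server_ready`, and `is_model_ready` are hypothetical and do not come from the specification.

```python
import json


def parse_bool_field(body: str, field: str) -> bool:
    """Parse a health response body and return the named boolean field.

    Hypothetical helper: raises ValueError if the body is not a JSON object
    or if the field is missing or not a JSON boolean.
    """
    payload = json.loads(body)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(payload, dict):
        raise ValueError("health response must be a JSON object")
    value = payload.get(field)
    if not isinstance(value, bool):
        raise ValueError(f"field {field!r} must be a JSON boolean")
    return value


def is_server_live(body: str) -> bool:
    # $live_server_response carries "live"
    return parse_bool_field(body, "live")


def is_server_ready(body: str) -> bool:
    # $ready_server_response carries "ready" (it was "live" before this change)
    return parse_bool_field(body, "ready")


def is_model_ready(body: str) -> bool:
    # $ready_model_response carries "ready"; "name" is no longer listed
    return parse_bool_field(body, "ready")
```

One reason the trailing-comma removal is worth having: a server that copied the old template literally would emit `{"live": true,}`, which strict parsers such as Python's `json` module reject.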
45 changes: 45 additions & 0 deletions specification/protocol/open_inference_rest.yaml
@@ -28,6 +28,10 @@ paths:
       responses:
         '200':
           description: OK
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/live_server_response'
       operationId: get-v2-health-live
       description: The “server live” API indicates if the inference server is able to receive and respond to metadata and inference requests. The “server live” API can be used directly to implement the Kubernetes livenessProbe.
   /v2/health/ready:
@@ -37,6 +41,10 @@ paths:
       responses:
         '200':
           description: OK
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/ready_server_response'
       operationId: get-v2-health-ready
       description: The “server ready” health API indicates if all the models are ready for inferencing. The “server ready” health API can be used directly to implement the Kubernetes readinessProbe.
   '/v2/models/${MODEL_NAME}/versions/${MODEL_VERSION}/ready':
@@ -57,6 +65,10 @@ paths:
       responses:
         '200':
           description: OK
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/ready_model_response'
       operationId: get-v2-models-$-modelName-versions-$-modelVersion-ready
       description: The “model ready” health API indicates if a specific model is ready for inferencing. The model name and (optionally) version must be available in the URL. If a version is not provided the server may choose a version based on its own policies.
   /v2/:
@@ -100,6 +112,12 @@ paths:
             application/json:
               schema:
                 $ref: '#/components/schemas/metadata_model_response'
+        '400':
+          description: Bad Request
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/metadata_model_error_response'
       operationId: get-v2-models-$-modelName-versions-$-modelVersion
       description: 'The per-model metadata endpoint provides information about a model. A model metadata request is made with an HTTP GET to a model metadata endpoint. In the corresponding response the HTTP body contains the [Model Metadata Response JSON Object](#model-metadata-response-json-object) or the [Model Metadata Response JSON Error Object](#model-metadata-response-json-error-object). The model name and (optionally) version must be available in the URL. If a version is not provided the server may choose a version based on its own policies or return an error.'
   '/v2/models/${MODEL_NAME}/versions/${MODEL_VERSION}/infer':
@@ -138,6 +156,33 @@ paths:
       description: 'An inference request is made with an HTTP POST to an inference endpoint. In the request the HTTP body contains the [Inference Request JSON Object](#inference-request-json-object). In the corresponding response the HTTP body contains the [Inference Response JSON Object](#inference-response-json-object) or [Inference Response JSON Error Object](#inference-response-json-error-object). See [Inference Request Examples](#inference-request-examples) for some example HTTP/REST requests and responses.'
 components:
   schemas:
+    live_server_response:
+      title: live_server_response
+      type: object
+      description: ''
+      properties:
+        live:
+          type: boolean
+      required:
+        - live
+    ready_server_response:
+      title: ready_server_response
+      type: object
+      description: ''
+      properties:
+        ready:
+          type: boolean
+      required:
+        - ready
+    ready_model_response:
+      title: ready_model_response
+      type: object
+      description: ''
+      properties:
+        ready:
+          type: boolean
+      required:
+        - ready
     metadata_server_response:
       title: metadata_server_response
       type: object
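
To illustrate what the three new schemas require, here is a small Python sketch (again a reviewer aid, not part of the change). It builds the model-ready path with an optional version and checks a decoded payload against the required boolean field of each schema. The names `REQUIRED_FIELDS`, `model_ready_path`, and `conforms` are hypothetical. The unversioned path `/v2/models/<name>/ready` is an assumption: the YAML shown here only lists the versioned path, although the endpoint description says the version is optional.

```python
from typing import Optional
from urllib.parse import quote

# Required boolean property per schema, as declared under components/schemas.
REQUIRED_FIELDS = {
    "live_server_response": "live",
    "ready_server_response": "ready",
    "ready_model_response": "ready",
}


def model_ready_path(model_name: str, model_version: Optional[str] = None) -> str:
    """Build the model-ready path, percent-encoding each path segment.

    Assumption: omitting the version drops the /versions/<version> segment.
    """
    path = f"/v2/models/{quote(model_name, safe='')}"
    if model_version is not None:
        path += f"/versions/{quote(model_version, safe='')}"
    return path + "/ready"


def conforms(schema_name: str, payload: object) -> bool:
    """Return True if payload is an object whose required field is a boolean.

    The schemas do not forbid additional properties, so extra keys are allowed.
    """
    field = REQUIRED_FIELDS[schema_name]
    return isinstance(payload, dict) and isinstance(payload.get(field), bool)
```

Because none of the schemas set `additionalProperties: false`, a server that still includes `name` in its model-ready response would pass this check; only the required boolean is enforced.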