[FEATURE] Why not Batch Predict? #4276

@urmichm

Description

Is your feature request related to a problem?
In TextSimilarityCrossEncoderModel, predictions are performed one by one inside a loop using getPredictor().predict(input), rather than processing the inputs in batches with batchPredict(listOfInputs).

This approach looks inefficient. Is there a specific reason for processing the predictions one at a time?

To my understanding, Predictor.java internally wraps each input in a single-element batch and calls batchPredict anyway.

The same behaviour is present in TextEmbeddingModel.

What solution would you like?
Use batchPredict to improve efficiency.
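To illustrate the request, here is a minimal sketch of the two call patterns. `StubPredictor` is a hypothetical stand-in for DJL's `Predictor<I, O>` (only the `predict`/`batchPredict` names mirror the real API); it counts how often the underlying batch path runs, showing that the loop pays one batch invocation per input while the proposed version pays one in total.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for DJL's Predictor<I, O>.
class StubPredictor<I, O> {
    private final Function<List<I>, List<O>> batchFn;
    int batchCalls = 0; // counts invocations of the batch path

    StubPredictor(Function<List<I>, List<O>> batchFn) {
        this.batchFn = batchFn;
    }

    // Single prediction: wraps the input in a one-element batch,
    // which is what Predictor.java is described as doing internally.
    O predict(I input) {
        return batchPredict(List.of(input)).get(0);
    }

    List<O> batchPredict(List<I> inputs) {
        batchCalls++;
        return batchFn.apply(inputs);
    }
}

public class BatchPredictSketch {
    // Current pattern: one predict() call per input -> N batch invocations.
    static List<Integer> oneByOne(StubPredictor<String, Integer> p, List<String> inputs) {
        List<Integer> out = new ArrayList<>();
        for (String s : inputs) {
            out.add(p.predict(s));
        }
        return out;
    }

    // Proposed pattern: a single batchPredict() call for all inputs.
    static List<Integer> batched(StubPredictor<String, Integer> p, List<String> inputs) {
        return p.batchPredict(inputs);
    }

    public static void main(String[] args) {
        List<String> inputs = List.of("a", "bb", "ccc");
        Function<List<String>, List<Integer>> lengths =
                batch -> batch.stream().map(String::length).toList();

        StubPredictor<String, Integer> loopPredictor = new StubPredictor<>(lengths);
        StubPredictor<String, Integer> batchPredictor = new StubPredictor<>(lengths);

        List<Integer> a = oneByOne(loopPredictor, inputs);
        List<Integer> b = batched(batchPredictor, inputs);

        // Same results, but the loop hits the batch path once per input.
        System.out.println(a.equals(b));               // true
        System.out.println(loopPredictor.batchCalls);  // 3
        System.out.println(batchPredictor.batchCalls); // 1
    }
}
```

With a real engine behind the predictor, the batched call also lets the backend run the inputs as one tensor batch instead of N separate forward passes.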

Metadata

Labels

enhancement (New feature or request)

Projects

Status

In Progress
