Support for asynchronous requests for watsonx.ai chat #1666
Conversation
Signed-off-by: Paweł Knes <[email protected]>
Looks good. I've merged it into my testing branch to check. The unit tests are failing, but that might not be related. @elronbandel and @eladven are looking at it.
Signed-off-by: Paweł Knes <[email protected]>
@yoavkatz one error is related to the obsolete version of
I see the catalog consistency test failing as well, but it seems to be an issue with HF.
Added support for asynchronous requests in the `WMLInferenceEngineChat`. The default concurrency limit is set to be the same as in the case of the `WMLInferenceEngineGeneration`: 10.

Small performance test (for the dataset at the bottom, averaged over 3 runs each):
- (`concurrency_limit=10`): 6.4 seconds
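The bounded-concurrency pattern described above can be sketched with a plain `asyncio` semaphore. This is a minimal illustration only: `fake_chat_request` and `run_all` are hypothetical stand-ins, not the actual `WMLInferenceEngineChat` internals or the watsonx.ai SDK API.

```python
import asyncio

async def fake_chat_request(prompt: str) -> str:
    """Hypothetical stand-in for a single watsonx.ai chat call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"response to {prompt}"

async def run_all(prompts, concurrency_limit: int = 10):
    # The semaphore caps the number of in-flight requests, mirroring
    # the concurrency_limit=10 default mentioned in the description.
    sem = asyncio.Semaphore(concurrency_limit)

    async def bounded(prompt):
        async with sem:
            return await fake_chat_request(prompt)

    # gather preserves input order, so results align with prompts.
    return await asyncio.gather(*(bounded(p) for p in prompts))

results = asyncio.run(run_all([f"q{i}" for i in range(25)]))
print(len(results))  # 25
```

With a limit of 10, at most ten requests are awaiting the (simulated) network at any moment; the remaining fifteen queue on the semaphore.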