.NET Bug: Function calling does not work properly with Llama series models when hosted on Azure. #10221
Closed
Description
Describe the bug
I deployed three different models to Azure AI (Machine Learning Studio) serverless endpoints:
- Mistral-Nemo
- Llama-3.3-70B-Instruct
- Llama-3.2-90B-Vision-Instruct
I tested them with the SemanticKernel package 1.33.0 plus Microsoft.SemanticKernel.Connectors.AzureAIInference 1.33.0-beta. The Mistral-Nemo model works perfectly, but the Llama series models do not.
To Reproduce
Steps to reproduce the behavior:
- Deploy an open-source model (e.g. Llama-3.3-70B) to a serverless endpoint.
- Make a GetChatMessageContentsAsync call with tools.
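A minimal repro sketch of the steps above. The endpoint URL, API key, and the `GetUtcNow` sample function are placeholders, not from the original report; the SK calls are the standard 1.33.0 APIs referenced in the stack trace:

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

// Placeholders: substitute your own serverless endpoint and key.
var builder = Kernel.CreateBuilder();
builder.AddAzureAIInferenceChatCompletion(
    modelId: "Llama-3.3-70B-Instruct",
    apiKey: "<api-key>",
    endpoint: new Uri("https://<endpoint>.inference.ai.azure.com"));

var kernel = builder.Build();

// A trivial sample tool so the request carries a tool definition.
kernel.Plugins.AddFromFunctions("Time",
[
    KernelFunctionFactory.CreateFromMethod(
        () => DateTime.UtcNow.ToString("o"),
        "GetUtcNow",
        "Gets the current UTC time.")
]);

var settings = new PromptExecutionSettings
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};

var chat = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddUserMessage("What time is it (UTC)?");

// Mistral-Nemo invokes the tool here; the Llama models do not
// (and Llama-3.2-90B throws the 400 shown under "Additional context").
var result = await chat.GetChatMessageContentsAsync(history, settings, kernel);
Console.WriteLine(result[0].Content);
```

Swapping `GetChatMessageContentsAsync` for `GetStreamingChatMessageContentsAsync` reproduces the same behavior.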
Expected behavior
Models like Llama 3.3 and Llama 3.2 should support tool calling and are expected to function as they do when hosted in Ollama.
Platform
- OS: Mac
- IDE: Rider
- Language: C#
- Source: SK 1.33.0, Microsoft.SemanticKernel.Connectors.AzureAIInference 1.33.0-beta
Additional context
With Llama-3.3-70B, when making a chat-completion call with tools, it responds with an incorrect result and does not invoke any tools, regardless of whether GetChatMessageContentsAsync or GetStreamingChatMessageContentsAsync is used.
With Llama-3.2-90B, when making a chat-completion call with tools, it throws an exception right away (error message below):
Azure.RequestFailedException: {"object":"error","message":"\"auto\" tool choice requires --enable-auto-tool-choice and --tool-call-parser to be set","type":"BadRequestError","param":null,"code":400}
Status: 400 (Bad Request)
ErrorCode: Bad Request
Content:
{"error":{"code":"Bad Request","message":"{\"object\":\"error\",\"message\":\"\\\"auto\\\" tool choice requires --enable-auto-tool-choice and --tool-call-parser to be set\",\"type\":\"BadRequestError\",\"param\":null,\"code\":400}","status":400}}
Headers:
x-ms-rai-invoked: REDACTED
x-envoy-upstream-service-time: REDACTED
X-Request-ID: REDACTED
ms-azureml-model-error-reason: REDACTED
ms-azureml-model-error-statuscode: REDACTED
ms-azureml-model-time: REDACTED
azureml-destination-model-group: REDACTED
azureml-destination-region: REDACTED
azureml-destination-deployment: REDACTED
azureml-destination-endpoint: REDACTED
x-ms-client-request-id: 42bc2161-5a3f-4e88-8c9e-577390db941e
Request-Context: REDACTED
azureml-model-session: REDACTED
azureml-model-group: REDACTED
Date: Wed, 15 Jan 2025 05:46:51 GMT
Content-Length: 246
Content-Type: application/json
at Azure.Core.HttpPipelineExtensions.ProcessMessageAsync(HttpPipeline pipeline, HttpMessage message, RequestContext requestContext, CancellationToken cancellationToken)
at Azure.AI.Inference.ChatCompletionsClient.CompleteAsync(RequestContent content, String extraParams, RequestContext context)
at Azure.AI.Inference.ChatCompletionsClient.CompleteAsync(ChatCompletionsOptions chatCompletionsOptions, CancellationToken cancellationToken)
at Microsoft.Extensions.AI.AzureAIInferenceChatClient.CompleteAsync(IList`1 chatMessages, ChatOptions options, CancellationToken cancellationToken)
at Microsoft.Extensions.AI.FunctionInvokingChatClient.CompleteAsync(IList`1 chatMessages, ChatOptions options, CancellationToken cancellationToken)
at Microsoft.SemanticKernel.ChatCompletion.ChatClientChatCompletionService.GetChatMessageContentsAsync(ChatHistory chatHistory, PromptExecutionSettings executionSettings, Kernel kernel, CancellationToken cancellationToken)
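The error body above matches vLLM's error format, which suggests the serverless endpoint is backed by a vLLM server launched without tool-calling support (this is an assumption; the Azure-managed server configuration is not controllable by the caller). For reference, a self-hosted vLLM server enables "auto" tool choice with the flags the error names, e.g.:

```shell
# Flags named in the 400 error; the model path and parser name are
# illustrative for a self-hosted deployment, not the Azure configuration.
vllm serve meta-llama/Llama-3.3-70B-Instruct \
  --enable-auto-tool-choice \
  --tool-call-parser llama3_json
```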