.Net Bug: Function calling does not work properly with Llama series models when hosted on Azure. #10221

Closed
@bauann

Description

Describe the bug
I deployed 3 different models in Azure AI (Machine Learning Studio) serverless endpoint.

  • Mistral-Nemo
  • Llama-3.3-70B-Instruct
  • Llama-3.2-90B-Vision-Instruct

I tested them with the SemanticKernel package 1.33.0 plus Microsoft.SemanticKernel.Connectors.AzureAIInference 1.33.0-beta. The Mistral-Nemo model works perfectly, but the Llama series models do not.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy an open-source model (e.g. Llama-3.3-70B) to a serverless endpoint.
  2. Make a GetChatMessageContentsAsync call with tools.
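
A minimal sketch of the two steps above, assuming SK 1.33.0 and the AzureAIInference connector. The endpoint URL, API key, and `TimePlugin` are placeholders for illustration, not from the original report:

```csharp
// Sketch: chat completion with automatic tool calling against an
// Azure AI serverless endpoint. Placeholders: endpoint, key, TimePlugin.
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.AzureAIInference;

var builder = Kernel.CreateBuilder();
builder.AddAzureAIInferenceChatCompletion(
    modelId: "Llama-3.3-70B-Instruct",
    apiKey: "<api-key>",
    endpoint: new Uri("https://<your-endpoint>.inference.ai.azure.com"));
var kernel = builder.Build();
kernel.Plugins.AddFromType<TimePlugin>(); // any plugin exposing a [KernelFunction]

var settings = new AzureAIInferencePromptExecutionSettings
{
    // Sends tool_choice "auto", which is what the Llama endpoints reject.
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};

var chat = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddUserMessage("What time is it?");

// Works with Mistral-Nemo; fails (wrong result or 400) with the Llama models.
var result = await chat.GetChatMessageContentsAsync(history, settings, kernel);
```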

Expected behavior
Models like Llama 3.3 and Llama 3.2 should support tool calling and are expected to function as they do when hosted in Ollama.

Platform

  • OS: Mac
  • IDE: Rider
  • Language: C#
  • Source: SK 1.33.0, Microsoft.SemanticKernel.Connectors.AzureAIInference 1.33.0-beta

Additional context
With Llama-3.3-70B, a chat completion call with tools responds with an incorrect result and does not invoke any tools, regardless of whether GetChatMessageContentsAsync or GetStreamingChatMessageContentsAsync is used.

With Llama-3.2-90B, a chat completion call with tools throws an exception immediately (error message below):

Azure.RequestFailedException: {"object":"error","message":"\"auto\" tool choice requires --enable-auto-tool-choice and --tool-call-parser to be set","type":"BadRequestError","param":null,"code":400}
Status: 400 (Bad Request)
ErrorCode: Bad Request

Content:
{"error":{"code":"Bad Request","message":"{\"object\":\"error\",\"message\":\"\\\"auto\\\" tool choice requires --enable-auto-tool-choice and --tool-call-parser to be set\",\"type\":\"BadRequestError\",\"param\":null,\"code\":400}","status":400}}
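
The wording of the error suggests the serving backend behind the endpoint is vLLM, which requires explicit flags at launch time to enable automatic tool choice. On a self-hosted vLLM deployment the equivalent fix would look like the following (the flags named in the error are real vLLM options; the model name and parser value here are illustrative, and on an Azure-managed serverless endpoint these flags are not user-configurable):

```shell
# Hypothetical self-hosted vLLM launch enabling tool calling for a Llama model.
# On the Azure serverless endpoint, only the service operator could set these.
vllm serve meta-llama/Llama-3.3-70B-Instruct \
  --enable-auto-tool-choice \
  --tool-call-parser llama3_json
```

Since callers of the serverless endpoint cannot pass these flags, the failure appears to be a deployment-side configuration issue rather than something fixable from Semantic Kernel.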

Headers:
x-ms-rai-invoked: REDACTED
x-envoy-upstream-service-time: REDACTED
X-Request-ID: REDACTED
ms-azureml-model-error-reason: REDACTED
ms-azureml-model-error-statuscode: REDACTED
ms-azureml-model-time: REDACTED
azureml-destination-model-group: REDACTED
azureml-destination-region: REDACTED
azureml-destination-deployment: REDACTED
azureml-destination-endpoint: REDACTED
x-ms-client-request-id: 42bc2161-5a3f-4e88-8c9e-577390db941e
Request-Context: REDACTED
azureml-model-session: REDACTED
azureml-model-group: REDACTED
Date: Wed, 15 Jan 2025 05:46:51 GMT
Content-Length: 246
Content-Type: application/json

   at Azure.Core.HttpPipelineExtensions.ProcessMessageAsync(HttpPipeline pipeline, HttpMessage message, RequestContext requestContext, CancellationToken cancellationToken)
   at Azure.AI.Inference.ChatCompletionsClient.CompleteAsync(RequestContent content, String extraParams, RequestContext context)
   at Azure.AI.Inference.ChatCompletionsClient.CompleteAsync(ChatCompletionsOptions chatCompletionsOptions, CancellationToken cancellationToken)
   at Microsoft.Extensions.AI.AzureAIInferenceChatClient.CompleteAsync(IList`1 chatMessages, ChatOptions options, CancellationToken cancellationToken)
   at Microsoft.Extensions.AI.FunctionInvokingChatClient.CompleteAsync(IList`1 chatMessages, ChatOptions options, CancellationToken cancellationToken)
   at Microsoft.SemanticKernel.ChatCompletion.ChatClientChatCompletionService.GetChatMessageContentsAsync(ChatHistory chatHistory, PromptExecutionSettings executionSettings, Kernel kernel, CancellationToken cancellationToken)

Labels: .NET (Issue or Pull requests regarding .NET code), bug (Something isn't working)