feat(deepinfra): add new models [bot] #1009

Merged
harshiv-26 merged 3 commits into main from bot/add-deepinfra-20260513-000345 on May 13, 2026

Conversation

@models-bot
Contributor

@models-bot models-bot Bot commented May 13, 2026

Auto-generated by model-addition-agent for provider deepinfra.


Note

Low Risk
Low risk: this is a data-only addition of a new model YAML entry with no code-path or behavior changes beyond making the model selectable with its declared limits/costs.

Overview
Registers the new DeepInfra model google/gemma-4-31B-it-turbo via a provider YAML, defining its chat mode capabilities (function_calling, json_output), token limits (262k context/max), and per-token pricing metadata.
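
As a rough illustration of what such a data-only entry carries, the sketch below models it as a Python dataclass and runs a few consistency checks. The field names, the 262,144-token figure (an assumed reading of "262k"), and both price values are placeholders chosen for the example; they are not taken from the actual YAML file.

```python
from dataclasses import dataclass, field


@dataclass
class ModelEntry:
    """Hypothetical in-memory view of a provider model entry (not the real schema)."""
    provider: str
    model: str
    mode: str
    features: list[str] = field(default_factory=list)
    context_window: int = 0
    max_tokens: int = 0
    input_cost_per_token: float = 0.0
    output_cost_per_token: float = 0.0


def validate(entry: ModelEntry) -> list[str]:
    """Return human-readable problems; an empty list means the entry looks consistent."""
    problems = []
    if entry.context_window <= 0:
        problems.append("context_window must be positive")
    if not 0 < entry.max_tokens <= entry.context_window:
        problems.append("max_tokens must be in (0, context_window]")
    if entry.input_cost_per_token < 0 or entry.output_cost_per_token < 0:
        problems.append("costs must be non-negative")
    return problems


entry = ModelEntry(
    provider="deepinfra",
    model="google/gemma-4-31B-it-turbo",
    mode="chat",
    features=["function_calling", "json_output"],
    context_window=262_144,      # assumption: "262k" read as 2**18
    max_tokens=262_144,          # assumption: same limit, per "262k context/max"
    input_cost_per_token=1e-7,   # placeholder, not the real price
    output_cost_per_token=2e-7,  # placeholder, not the real price
)
print(validate(entry))  # prints []
```

A check of this kind would flag, for example, an entry whose output limit exceeds its context window before the model becomes selectable.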

Reviewed by Cursor Bugbot for commit 96d9850. Bugbot is set up for automated code reviews on this repo.

Comment thread providers/deepinfra/google/gemma-4-31B-it-turbo.yaml Outdated
@github-actions
Contributor

/test-models

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 56ed71f.

Comment thread providers/deepinfra/google/gemma-4-31B-it-turbo.yaml
@harshiv-26
Collaborator

/test-models

@harshiv-26
Collaborator

Gateway test results

  • Total: 6
  • Passed: 3
  • Failed: 3
  • Validation failed: 0
  • Errored: 0
  • Skipped: 0
  • Success rate: 50.0%
Provider | Model | Scenarios
deepinfra | google/gemma-4-31B-it-turbo | success: params, tool-call:stream, json-output; failure: json-output:stream, tool-call, params:stream
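
The summary figures follow directly from the per-scenario outcomes listed above. A minimal sketch of that tally is shown here; the function name and the shape of the result dictionary are illustrative and are not the gateway tester's actual code.

```python
def summarize(results: dict[str, str]) -> dict[str, object]:
    """Tally per-scenario statuses into the totals shown in the report."""
    total = len(results)
    passed = sum(1 for status in results.values() if status == "success")
    failed = sum(1 for status in results.values() if status == "failure")
    # Guard against an empty run so the rate is always defined.
    rate = round(100.0 * passed / total, 1) if total else 0.0
    return {"total": total, "passed": passed, "failed": failed, "success_rate": rate}


# Scenario outcomes copied from this test run.
results = {
    "params": "success",
    "tool-call:stream": "success",
    "json-output": "success",
    "json-output:stream": "failure",
    "tool-call": "failure",
    "params:stream": "failure",
}
print(summarize(results))
# prints {'total': 6, 'passed': 3, 'failed': 3, 'success_rate': 50.0}
```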
Failures (3)

deepinfra/google/gemma-4-31B-it-turbo — json-output:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmppll09zid/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.AuthenticationError: Error code: 401 - {'status': 'failure', 'message': 'Unauthorized: Service account gateway-tester-v2-97cf7528-c does not exist', 'error': {'message': 'Unauthorized: Service account gateway-tester-v2-97cf7528-c does not exist', 'type': 'Error', 'code': '401'}, 'error_origin_level': 'authentication'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-deepinfra/google-gemma-4-31B-it-turbo",
    messages=[
        {"role": "user", "content": "List 3 colors with their hex codes in JSON."},
    ],
    response_format={"type": "json_object"},
    stream=True,
)

import json as _json

_accumulated = ""
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            _accumulated += delta.content
            print(delta.content, end="", flush=True)

if not _accumulated:
    raise Exception("VALIDATION FAILED: json-output stream - no content received")

_json.loads(_accumulated)
print("\nVALIDATION: json-output stream SUCCESS")

deepinfra/google/gemma-4-31B-it-turbo — tool-call (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpac7yqp7n/snippet.py", line 27, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.AuthenticationError: Error code: 401 - {'status': 'failure', 'message': 'Unauthorized: Service account gateway-tester-v2-97cf7528-c does not exist', 'error': {'message': 'Unauthorized: Service account gateway-tester-v2-97cf7528-c does not exist', 'type': 'Error', 'code': '401'}, 'error_origin_level': 'authentication'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-deepinfra/google-gemma-4-31B-it-turbo",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=False,
)

_message = response.choices[0].message
if _message.tool_calls:
    for _tc in _message.tool_calls:
        print(f"Function: {_tc.function.name}")
        print(f"Arguments: {_tc.function.arguments}")
else:
    print(_message.content)

if not _message.tool_calls or len(_message.tool_calls) == 0:
    raise Exception("VALIDATION FAILED: tool-call - no tool calls in response")
print("VALIDATION: tool-call SUCCESS")

deepinfra/google/gemma-4-31B-it-turbo — params:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmptb39d3ay/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.AuthenticationError: Error code: 401 - {'status': 'failure', 'message': 'Unauthorized: Service account gateway-tester-v2-97cf7528-c does not exist', 'error': {'message': 'Unauthorized: Service account gateway-tester-v2-97cf7528-c does not exist', 'type': 'Error', 'code': '401'}, 'error_origin_level': 'authentication'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-deepinfra/google-gemma-4-31B-it-turbo",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=True,
)

for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
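
All three tracebacks above end in the same `openai.AuthenticationError` (401, service account does not exist), which points at the test credentials rather than at the model; the later runs in this thread pass all six scenarios. A test harness could separate such infrastructure failures from genuine model or validation failures. The sketch below does so by inspecting the final line of the captured error text. The function name and the category labels are hypothetical, and the status-code grouping is one reasonable choice, not an established convention of this repository.

```python
import re

# Matches the status code in messages such as "Error code: 401 - {...}".
_STATUS_RE = re.compile(r"Error code: (\d{3})")


def classify_failure(error_text: str) -> str:
    """Return a coarse category for a captured traceback or error message."""
    lines = [line for line in error_text.strip().splitlines() if line.strip()]
    if not lines:
        return "unknown"
    match = _STATUS_RE.search(lines[-1])  # the exception line comes last
    if match is None:
        # No HTTP status: treat as a failure raised by the snippet itself.
        return "model-or-validation"
    status = int(match.group(1))
    if status in (401, 403):
        return "auth"
    if status == 429 or status >= 500:
        return "transient"
    return "request"


sample = (
    "Traceback (most recent call last):\n"
    "  ...\n"
    "openai.AuthenticationError: Error code: 401 - "
    "{'status': 'failure', 'message': 'Unauthorized: Service account "
    "gateway-tester-v2-97cf7528-c does not exist'}"
)
print(classify_failure(sample))  # prints auth
```

Reporting the "auth" category separately would make it clear at a glance that a 50.0% success rate like the one above says nothing about the model itself.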

@harshiv-26
Collaborator

/test-models

@harshiv-26
Collaborator

Gateway test results

  • Total: 6
  • Passed: 6
  • Failed: 0
  • Validation failed: 0
  • Errored: 0
  • Skipped: 0
  • Success rate: 100.0%
Provider | Model | Scenarios
deepinfra | google/gemma-4-31B-it-turbo | success: params:stream, tool-call:stream, tool-call, json-output, params, json-output:stream

@harshiv-26 harshiv-26 enabled auto-merge (squash) May 13, 2026 11:49
@github-actions
Contributor

/test-models

@harshiv-26
Collaborator

Gateway test results

  • Total: 6
  • Passed: 6
  • Failed: 0
  • Validation failed: 0
  • Errored: 0
  • Skipped: 0
  • Success rate: 100.0%
Provider | Model | Scenarios
deepinfra | google/gemma-4-31B-it-turbo | success: params, tool-call:stream, params:stream, tool-call, json-output, json-output:stream

@harshiv-26 harshiv-26 merged commit ef586c8 into main May 13, 2026
8 checks passed
@harshiv-26 harshiv-26 deleted the bot/add-deepinfra-20260513-000345 branch May 13, 2026 12:10