
feat(together-ai): update model YAMLs [bot]#1016

Merged
harshiv-26 merged 3 commits into main from bot/update-together-ai-20260513-022331 on May 13, 2026

Conversation

@models-bot (Bot, Contributor) commented May 13, 2026

Auto-generated by poc-agent for provider together-ai.


Note

Low Risk
Low-risk, metadata-only updates to Together.ai model YAMLs; the main risk is that incorrect capability declarations (modalities, context window, features) cause misrouting or failed requests.

Overview
Updates Together.ai’s NVIDIA Nemotron model YAMLs to better describe capabilities and availability.

NVIDIA-Nemotron-3-Super-120B-A12B-BF16 now declares the function_calling and system_messages features, adds source URLs, and flags thinking support.

nemotron-3-nano-omni-30b-a3b-reasoning-fp8 changes type from unknown to chat, raises context_window to 256000, declares multimodal inputs, and adds provisioning/source/status metadata plus thinking support.
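The changes described above imply a model YAML shaped roughly like the following. This is an illustrative sketch only: the exact field names, value spellings, and layout of this repo's model YAMLs are assumptions, not taken from the diff.

```yaml
# Hypothetical shape of nemotron-3-nano-omni-30b-a3b-reasoning-fp8.yaml after this PR.
# Field names and values are illustrative assumptions.
type: chat                      # was: unknown
context_window: 256000
modalities:
  input: [text, image, audio]   # "multimodal inputs"; exact list assumed
features:
  - thinking
provisioning: dedicated         # provisioning/status metadata added; values assumed
status: available
source:
  - https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8
```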

Reviewed by Cursor Bugbot for commit f391da8. Bugbot is set up for automated code reviews on this repo. Configure here.

@github-actions (Contributor)

/test-models

@harshiv-26 (Collaborator)

Gateway test results

  • Total: 7
  • Passed: 0
  • Failed: 6
  • Validation failed: 0
  • Errored: 0
  • Skipped: 1
  • Success rate: 0.0%
| Provider | Model | Scenarios |
| --- | --- | --- |
| together-ai | nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 | skipped: skip-check |
| together-ai | nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 | failure: reasoning, params:stream, params, tool-call, reasoning:stream, tool-call:stream |
Failures (6)

together-ai/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 — reasoning (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpw92su60d/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8. Please visit https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8. Please visit https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/nvidia-nemotron-3-nano-omni-30b-a3b-reasoning-fp8",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=False,
)

_usage = getattr(response, "usage", None)
_reasoning_detected = False

_choices = getattr(response, "choices", None)
if _choices and len(_choices) > 0:
    _message = getattr(_choices[0], "message", None)
else:
    _message = None

if _message and getattr(_message, "content", None) is not None:
    print(_message.content)

if _usage is not None:
    _output_token_details = getattr(_usage, "completion_tokens_details", None)
    if _output_token_details and getattr(_output_token_details, "reasoning_tokens", 0) > 0:
        _reasoning_detected = True
    elif getattr(_usage, "reasoning", None) is not None:
        _reasoning_detected = True

if getattr(_message, "reasoning_content", None) is not None:
    _reasoning_detected = True
elif getattr(_message, "reasoning", None) is not None:
    _reasoning_detected = True

if not _reasoning_detected:
    print("Response: ", response)
    raise Exception("VALIDATION FAILED: reasoning - no reasoning information in response")
print("VALIDATION: reasoning SUCCESS")

together-ai/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 — params:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpuknawst9/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8. Please visit https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8. Please visit https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/nvidia-nemotron-3-nano-omni-30b-a3b-reasoning-fp8",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=True,
)

for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)

together-ai/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 — params (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmploe9ury_/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8. Please visit https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8. Please visit https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/nvidia-nemotron-3-nano-omni-30b-a3b-reasoning-fp8",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=False,
)

print(response.choices[0].message.content)

together-ai/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 — tool-call (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmp1d5p2ma9/snippet.py", line 27, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8. Please visit https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8. Please visit https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/nvidia-nemotron-3-nano-omni-30b-a3b-reasoning-fp8",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=False,
)

_message = response.choices[0].message
if _message.tool_calls:
    for _tc in _message.tool_calls:
        print(f"Function: {_tc.function.name}")
        print(f"Arguments: {_tc.function.arguments}")
else:
    print(_message.content)

if not _message.tool_calls or len(_message.tool_calls) == 0:
    raise Exception("VALIDATION FAILED: tool-call - no tool calls in response")
print("VALIDATION: tool-call SUCCESS")

together-ai/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 — reasoning:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpf9rdtpag/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8. Please visit https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8. Please visit https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-together-ai/nvidia-nemotron-3-nano-omni-30b-a3b-reasoning-fp8",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)

_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

together-ai/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 — tool-call:stream (failure)

Error:

Traceback (most recent call last):
  File "/tmp/tmpr6lfrwq4/snippet.py", line 27, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'status': 'failure', 'message': 'together-ai error: Unable to access non-serverless model nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8. Please visit https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 to create and start a new dedicated endpoint for the model.', 'error': {'message': 'together-ai error: Unable to access non-serverless model nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8. Please visit https://api.together.ai/models/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 to create and start a new dedicated endpoint for the model.', 'type': 'APIError', 'code': '400'}, 'error_origin_level': 'api_error', 'provider': 'together-ai'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-together-ai/nvidia-nemotron-3-nano-omni-30b-a3b-reasoning-fp8",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)

_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")
Skipped (1)

together-ai/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 — skip-check (skipped)

Skip reason:

Provisioned model

@github-actions (Contributor)

/test-models

harshiv-26 enabled auto-merge (squash) on May 13, 2026 at 12:46
@github-actions (Contributor)

/test-models

harshiv-26 merged commit cbfa9a0 into main on May 13, 2026
8 checks passed
harshiv-26 deleted the bot/update-together-ai-20260513-022331 branch on May 13, 2026 at 12:47
@harshiv-26 (Collaborator)

Gateway test results

  • Total: 2
  • Passed: 0
  • Failed: 0
  • Validation failed: 0
  • Errored: 0
  • Skipped: 2
  • Success rate: 0.0%
| Provider | Model | Scenarios |
| --- | --- | --- |
| together-ai | nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 | skipped: skip-check |
| together-ai | nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 | skipped: skip-check |
Skipped (2)

together-ai/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 — skip-check (skipped)

Skip reason:

Provisioned model

together-ai/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 — skip-check (skipped)

Skip reason:

Provisioned model

1 similar comment

@cursor (Bot) left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Reviewed snippet:

```yaml
region: "*"
features:
- function_calling
- system_messages
```


BF16 model missing features present in equivalent FP8 variant

Medium Severity

The newly added features list for the BF16 variant only includes function_calling and system_messages, while the equivalent FP8 variant (NVIDIA-Nemotron-3-Super-120B-A12B-FP8.yaml) declares function_calling, tool_choice, structured_output, and system_messages. Since both are the same base model at different quantization levels (and both are provisioned), the BF16 variant is likely missing tool_choice and structured_output. This could cause the gateway to incorrectly withhold these capabilities for the BF16 model.
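The fix Bugbot suggests would make the BF16 feature list mirror the one it reports for the FP8 variant. A sketch, assuming the surrounding YAML layout matches the reviewed snippet above:

```yaml
# Sketch of the suggested fix for NVIDIA-Nemotron-3-Super-120B-A12B-BF16.yaml:
# align the features list with the FP8 variant cited in the review.
# Surrounding fields and exact file layout are assumptions.
features:
- function_calling
- tool_choice
- structured_output
- system_messages
```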

