Publish TPS/throughput specs for our shared open-source models #10

Source: Cameron, #private-inference, May 19. Cursor (and likely other partners) ask for our TPS / throughput numbers on shared open-source models before they'll route traffic to us.

## What

- Publish per-model TPS (tokens per second), tokens/sec/card, TTFT (time to first token), and ITL (inter-token latency) for the models we serve (GLM-5.1, Qwen3.5-122B, Qwen3.6-35B, gpt-oss-120b, Gemma-4-31B, …)
- A reproducible methodology so partners can verify the numbers themselves
- A comparable baseline against other providers (e.g. Scaleway, Together, Fireworks)
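
To make the metric definitions above concrete, here is a minimal sketch of how they could be derived from per-token arrival timestamps of streamed requests. It is not how genai-benchmark computes them; `RequestTrace`, its fields, and all function names are hypothetical. The exact definitions are also assumptions that the published methodology would need to pin down, for example whether TPS includes the prefill wait.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RequestTrace:
    """Timing for one streamed request (hypothetical structure, times in seconds)."""
    sent_at: float            # when the request was sent
    token_times: list[float]  # arrival time of each output token, in order


def ttft(trace: RequestTrace) -> float:
    """Time to first token: first token arrival minus request send time."""
    return trace.token_times[0] - trace.sent_at


def itl(trace: RequestTrace) -> float:
    """Inter-token latency: mean gap between consecutive output tokens."""
    if len(trace.token_times) < 2:
        raise ValueError("ITL needs at least two output tokens")
    gaps = [b - a for a, b in zip(trace.token_times, trace.token_times[1:])]
    return mean(gaps)


def output_tps(trace: RequestTrace) -> float:
    """Per-request output tokens/sec, measured end to end (assumed: includes TTFT)."""
    return len(trace.token_times) / (trace.token_times[-1] - trace.sent_at)


def tokens_per_sec_per_card(traces: list[RequestTrace], num_cards: int) -> float:
    """Aggregate output tokens/sec over the whole run, divided by accelerator count."""
    total_tokens = sum(len(t.token_times) for t in traces)
    start = min(t.sent_at for t in traces)
    end = max(t.token_times[-1] for t in traces)
    return total_tokens / (end - start) / num_cards
```

Stating definitions at this level of detail in the published artifact would let a partner recompute every number from raw traces.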

## Where

- Ideally a public page or PDF we can hand to partners
- genai-benchmark already produces the numbers; the gap is the publication artifact

## Related

- nearai/infra#127 (Qwen 3.6 perf vs Scaleway — overlapping methodology)
- Lloyd's existing Scaleway report: [Notion](https://www.notion.so/jasnahcom/Near-vs-Scaleway-benchmark-report-36129a6526bf80689f7ee967a9c3bc5f)
