Publish TPS/throughput specs for our shared open-source models #10

@Evrard-Nil

Description

Source: Cameron, #private-inference, May 19. Cursor (and likely other partners) ask for our TPS / throughput numbers on shared open-source models before they'll route traffic to us.

What

  • Publish per-model TPS, tokens/sec/card, time to first token (TTFT), and inter-token latency (ITL) for the models we serve (GLM-5.1, Qwen3.5-122B, Qwen3.6-35B, gpt-oss-120b, Gemma-4-31B, …)
  • Document a reproducible methodology so partners can verify the numbers
  • Include a comparable baseline (e.g. Scaleway, Together, Fireworks)
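
For the methodology write-up, it may help to pin down how each metric is computed, since this issue does not define them. The sketch below uses commonly used definitions (TTFT as the delay to the first token, ITL as the mean gap between consecutive tokens, tokens/sec/card as aggregate output throughput divided by GPU count). These definitions and the names `request_metrics` and `tokens_per_sec_per_card` are assumptions for illustration, not genai-benchmark's actual API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RequestMetrics:
    ttft_s: float      # time to first token, in seconds
    mean_itl_s: float  # mean inter-token latency, in seconds
    output_tps: float  # output tokens per second for this single request


def request_metrics(sent_at: float, token_times: Sequence[float]) -> RequestMetrics:
    """Derive per-request metrics from the send time and token arrival times.

    Assumed definitions (hypothetical, not taken from genai-benchmark):
    all timestamps are in seconds on the same monotonic clock.
    """
    if not token_times:
        raise ValueError("need at least one generated token")
    ttft = token_times[0] - sent_at
    # Gaps between consecutive tokens; empty when only one token was generated.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    mean_itl = sum(gaps) / len(gaps) if gaps else 0.0
    total = token_times[-1] - sent_at
    tps = len(token_times) / total if total > 0 else 0.0
    return RequestMetrics(ttft_s=ttft, mean_itl_s=mean_itl, output_tps=tps)


def tokens_per_sec_per_card(
    total_output_tokens: int, wall_clock_s: float, num_cards: int
) -> float:
    """Aggregate output throughput across all requests, normalised per GPU."""
    if wall_clock_s <= 0 or num_cards <= 0:
        raise ValueError("wall_clock_s and num_cards must be positive")
    return total_output_tokens / wall_clock_s / num_cards
```

Whichever definitions are chosen, stating them next to the published numbers (along with concurrency, prompt and output lengths, and hardware) is what would let partners reproduce the results.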

Where

  • Ideally a public page or PDF we can hand to partners
  • genai-benchmark already produces the numbers; the gap is the publication artifact
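
As one possible shape for the publication artifact, the sketch below renders per-model results into a Markdown table that could be dropped into a public page or converted to PDF. The column names and the `render_markdown_table` helper are hypothetical; genai-benchmark's real output format is not described in this issue and would need to be mapped onto these fields.

```python
from typing import Mapping, Sequence

# Hypothetical column set, mirroring the metrics listed under "What".
COLUMNS = ("model", "tps", "tokens_per_sec_per_card", "ttft_ms", "itl_ms")


def _fmt(value: object) -> str:
    """Format floats with one decimal place; leave everything else as text."""
    if isinstance(value, float):
        return f"{value:.1f}"
    return str(value)


def render_markdown_table(rows: Sequence[Mapping[str, object]]) -> str:
    """Render one table row per model; every row must provide all COLUMNS."""
    header = "| " + " | ".join(COLUMNS) + " |"
    divider = "| " + " | ".join("---" for _ in COLUMNS) + " |"
    body = []
    for row in rows:
        missing = [col for col in COLUMNS if col not in row]
        if missing:
            raise KeyError(f"row is missing columns: {missing}")
        body.append("| " + " | ".join(_fmt(row[col]) for col in COLUMNS) + " |")
    return "\n".join([header, divider, *body])
```

Generating the table from benchmark output, rather than editing it by hand, keeps the published numbers tied to a specific run that partners can re-execute.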

Related

  • nearai/infra#127 (Qwen 3.6 perf vs Scaleway — overlapping methodology)
  • Lloyd's existing Scaleway report: Notion
