Problem
Frontier-model providers run their own evaluation suites for marketing claims. They don't currently include ClawBench results. Two reasons:
- We haven't pitched them.
- The path to "run our benchmark" wasn't frictionless. (Now it is: pip install clawbench-eval is on the homepage.)
A single mention in an Anthropic / OpenAI / Google model-launch blog post could plausibly 10x our inbound traffic for that quarter.
Action
- Identify contacts at provider evaluator teams (model-card maintainers, internal eval leads):
  - Anthropic: model-card maintainers
  - OpenAI: evals team
  - Google DeepMind: Gemini model-card team
  - Alibaba / Z.AI / DeepSeek: open-source evaluator teams
- Pitch packet: 1-paragraph description + 4-stat snapshot (corpus, platforms, top model, two-stage methodology) + pip install clawbench-eval reproduction line + offer to help integrate
- Track outreach in an outreach/ directory: who, when, response, follow-up
- Once accepted, write a blog post ("ClawBench in Anthropic's Claude Sonnet 4.X model card") for compounding press
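One possible shape for the outreach/ log, as a minimal sketch: one JSON file per contact attempt, holding the four fields named above (who, when, response, follow-up). The names OutreachRecord, save_record and pending, the file-naming scheme, and the dates in the usage note are all assumptions for illustration, not an existing ClawBench convention.

```python
import json
import re
from dataclasses import dataclass, asdict
from pathlib import Path
from typing import Optional


@dataclass
class OutreachRecord:
    """One contact attempt. Fields mirror the list above; the layout is hypothetical."""
    who: str                         # team or person contacted
    when: str                        # ISO date of the contact, e.g. "2025-01-15"
    response: Optional[str] = None   # None until a reply arrives
    follow_up: Optional[str] = None  # ISO date of the next planned follow-up


def save_record(record: OutreachRecord, root: Path = Path("outreach")) -> Path:
    """Write the record to <root>/<when>-<slug>.json and return the path."""
    root.mkdir(parents=True, exist_ok=True)
    # Collapse anything that is not a letter or digit into "-" so the name is path-safe.
    slug = re.sub(r"[^a-z0-9]+", "-", record.who.lower()).strip("-")
    path = root / f"{record.when}-{slug}.json"
    path.write_text(json.dumps(asdict(record), indent=2) + "\n", encoding="utf-8")
    return path


def pending(root: Path = Path("outreach")) -> list[OutreachRecord]:
    """Return records with no response yet, in filename (date) order."""
    records = [
        OutreachRecord(**json.loads(p.read_text(encoding="utf-8")))
        for p in sorted(root.glob("*.json"))
    ]
    return [r for r in records if r.response is None]
```

Usage would look like save_record(OutreachRecord(who="OpenAI: evals team", when="2025-01-15")) after sending a pitch (the date is a placeholder), then pending() to list contacts still awaiting a reply. Plain JSON files keep the log diffable in the repo; a single CSV would work just as well.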
Acceptance
- /blog (publicity: embeddable leaderboard widget, iframe at /embed/leaderboard for blogs + benchmarks aggregators, #186)
Why now
The publicity flywheel needs external validation, not just on-site polish. Provider mentions are the highest-leverage external validation available, and we've never explicitly tried to land one.