Commit 5a18b6d

Author: cube-registry-bot
ci: regenerate site [skip ci]
Signed-off-by: cube-registry-bot <cube-registry-bot@users.noreply.github.com>
1 parent 4bf7f63 commit 5a18b6d

5 files changed

Lines changed: 125 additions & 5 deletions

docs/index.html

Lines changed: 1 addition & 1 deletion
@@ -595,7 +595,7 @@ <h2 class="font-bold text-gray-900 text-base leading-snug">
 <p>
 CUBE Registry — <a href="https://github.com/The-AI-Alliance/cube-registry" class="underline">GitHub</a> ·
 <a href="https://github.com/The-AI-Alliance/cube-standard" class="underline">CUBE Standard</a> ·
-Generated 2026-06-01 20:11 UTC
+Generated 2026-06-03 18:46 UTC
 </p>
 <p class="mt-1">License information is self-reported. The AI Alliance makes no warranty as to its accuracy.</p>
 </footer>

docs/miniwob/index.html

Lines changed: 31 additions & 1 deletion
@@ -231,6 +231,36 @@ <h2 class="text-lg font-bold mb-4">Legal</h2>
 </section>
 
 
+<!-- ── Community results journal ─────────────────────────────────────── -->
+<section class="bg-white rounded-xl border border-gray-200 p-5">
+<div class="flex items-baseline justify-between mb-3">
+<h2 class="text-lg font-bold">
+Reproducibility journal
+
+</h2>
+<a href="https://github.com/The-AI-Alliance/cube-registry/blob/main/README.md#submitting-a-result"
+class="text-xs text-indigo-600 hover:underline">How to submit →</a>
+</div>
+
+<!-- Anti-leaderboard callout ─ the framing matters more than the data. -->
+<div class="mb-4 rounded-lg border border-amber-200 bg-amber-50 p-3 text-xs leading-relaxed text-amber-900">
+<p class="font-semibold mb-1">This is a reproducibility journal — not a leaderboard.</p>
+<p class="text-amber-800">
+Submissions document how <em>reference</em> agents and models score over time, across
+infrastructures, cube versions, and package versions. Use it to detect drift and validate
+environments. <strong>Not</strong> a place to publish a new agent or fine-tune to "win" — there is no
+ranking, scores are self-reported, and submissions are unverified. To showcase a new
+agent or model, use ATLAS / EEE / your own benchmark page.
+</p>
+</div>
+
+
+<p class="text-sm text-gray-400 text-center py-6">
+No submissions yet. Be the first — see <a href="https://github.com/The-AI-Alliance/cube-registry/blob/main/README.md#submitting-a-result" class="text-indigo-600 hover:underline">how to submit</a>.
+</p>
+
+</section>
+
 <!-- ── Parallelization info ──────────────────────────────────────────── -->
 
 <section class="bg-white rounded-xl border border-gray-200 p-5">
@@ -320,7 +350,7 @@ <h2 class="text-sm font-semibold text-gray-200 tracking-wide uppercase">Registry
 <p>
 CUBE Registry — <a href="https://github.com/The-AI-Alliance/cube-registry" class="underline">GitHub</a> ·
 <a href="../" class="underline">All benchmarks</a> ·
-Generated 2026-06-01 20:11 UTC
+Generated 2026-06-03 18:46 UTC
 </p>
 </footer>
 

docs/osworld/index.html

Lines changed: 31 additions & 1 deletion
@@ -283,6 +283,36 @@ <h2 class="text-lg font-bold mb-4">Legal</h2>
 </section>
 
 
+<!-- ── Community results journal ─────────────────────────────────────── -->
+<section class="bg-white rounded-xl border border-gray-200 p-5">
+<div class="flex items-baseline justify-between mb-3">
+<h2 class="text-lg font-bold">
+Reproducibility journal
+
+</h2>
+<a href="https://github.com/The-AI-Alliance/cube-registry/blob/main/README.md#submitting-a-result"
+class="text-xs text-indigo-600 hover:underline">How to submit →</a>
+</div>
+
+<!-- Anti-leaderboard callout ─ the framing matters more than the data. -->
+<div class="mb-4 rounded-lg border border-amber-200 bg-amber-50 p-3 text-xs leading-relaxed text-amber-900">
+<p class="font-semibold mb-1">This is a reproducibility journal — not a leaderboard.</p>
+<p class="text-amber-800">
+Submissions document how <em>reference</em> agents and models score over time, across
+infrastructures, cube versions, and package versions. Use it to detect drift and validate
+environments. <strong>Not</strong> a place to publish a new agent or fine-tune to "win" — there is no
+ranking, scores are self-reported, and submissions are unverified. To showcase a new
+agent or model, use ATLAS / EEE / your own benchmark page.
+</p>
+</div>
+
+
+<p class="text-sm text-gray-400 text-center py-6">
+No submissions yet. Be the first — see <a href="https://github.com/The-AI-Alliance/cube-registry/blob/main/README.md#submitting-a-result" class="text-indigo-600 hover:underline">how to submit</a>.
+</p>
+
+</section>
+
 <!-- ── Parallelization info ──────────────────────────────────────────── -->
 
 <section class="bg-white rounded-xl border border-gray-200 p-5">
@@ -394,7 +424,7 @@ <h2 class="text-sm font-semibold text-gray-200 tracking-wide uppercase">Registry
 <p>
 CUBE Registry — <a href="https://github.com/The-AI-Alliance/cube-registry" class="underline">GitHub</a> ·
 <a href="../" class="underline">All benchmarks</a> ·
-Generated 2026-06-01 20:11 UTC
+Generated 2026-06-03 18:46 UTC
 </p>
 </footer>
 

docs/webarena-verified/index.html

Lines changed: 31 additions & 1 deletion
@@ -248,6 +248,36 @@ <h2 class="text-lg font-bold mb-4">Legal</h2>
 </section>
 
 
+<!-- ── Community results journal ─────────────────────────────────────── -->
+<section class="bg-white rounded-xl border border-gray-200 p-5">
+<div class="flex items-baseline justify-between mb-3">
+<h2 class="text-lg font-bold">
+Reproducibility journal
+
+</h2>
+<a href="https://github.com/The-AI-Alliance/cube-registry/blob/main/README.md#submitting-a-result"
+class="text-xs text-indigo-600 hover:underline">How to submit →</a>
+</div>
+
+<!-- Anti-leaderboard callout ─ the framing matters more than the data. -->
+<div class="mb-4 rounded-lg border border-amber-200 bg-amber-50 p-3 text-xs leading-relaxed text-amber-900">
+<p class="font-semibold mb-1">This is a reproducibility journal — not a leaderboard.</p>
+<p class="text-amber-800">
+Submissions document how <em>reference</em> agents and models score over time, across
+infrastructures, cube versions, and package versions. Use it to detect drift and validate
+environments. <strong>Not</strong> a place to publish a new agent or fine-tune to "win" — there is no
+ranking, scores are self-reported, and submissions are unverified. To showcase a new
+agent or model, use ATLAS / EEE / your own benchmark page.
+</p>
+</div>
+
+
+<p class="text-sm text-gray-400 text-center py-6">
+No submissions yet. Be the first — see <a href="https://github.com/The-AI-Alliance/cube-registry/blob/main/README.md#submitting-a-result" class="text-indigo-600 hover:underline">how to submit</a>.
+</p>
+
+</section>
+
 <!-- ── Parallelization info ──────────────────────────────────────────── -->
 
 <section class="bg-white rounded-xl border border-gray-200 p-5">
@@ -348,7 +378,7 @@ <h2 class="text-sm font-semibold text-gray-200 tracking-wide uppercase">Registry
 <p>
 CUBE Registry — <a href="https://github.com/The-AI-Alliance/cube-registry" class="underline">GitHub</a> ·
 <a href="../" class="underline">All benchmarks</a> ·
-Generated 2026-06-01 20:11 UTC
+Generated 2026-06-03 18:46 UTC
 </p>
 </footer>
 

docs/workarena/index.html

Lines changed: 31 additions & 1 deletion
@@ -237,6 +237,36 @@ <h2 class="text-lg font-bold mb-4">Legal</h2>
 </section>
 
 
+<!-- ── Community results journal ─────────────────────────────────────── -->
+<section class="bg-white rounded-xl border border-gray-200 p-5">
+<div class="flex items-baseline justify-between mb-3">
+<h2 class="text-lg font-bold">
+Reproducibility journal
+
+</h2>
+<a href="https://github.com/The-AI-Alliance/cube-registry/blob/main/README.md#submitting-a-result"
+class="text-xs text-indigo-600 hover:underline">How to submit →</a>
+</div>
+
+<!-- Anti-leaderboard callout ─ the framing matters more than the data. -->
+<div class="mb-4 rounded-lg border border-amber-200 bg-amber-50 p-3 text-xs leading-relaxed text-amber-900">
+<p class="font-semibold mb-1">This is a reproducibility journal — not a leaderboard.</p>
+<p class="text-amber-800">
+Submissions document how <em>reference</em> agents and models score over time, across
+infrastructures, cube versions, and package versions. Use it to detect drift and validate
+environments. <strong>Not</strong> a place to publish a new agent or fine-tune to "win" — there is no
+ranking, scores are self-reported, and submissions are unverified. To showcase a new
+agent or model, use ATLAS / EEE / your own benchmark page.
+</p>
+</div>
+
+
+<p class="text-sm text-gray-400 text-center py-6">
+No submissions yet. Be the first — see <a href="https://github.com/The-AI-Alliance/cube-registry/blob/main/README.md#submitting-a-result" class="text-indigo-600 hover:underline">how to submit</a>.
+</p>
+
+</section>
+
 <!-- ── Parallelization info ──────────────────────────────────────────── -->
 
 <section class="bg-white rounded-xl border border-gray-200 p-5">
@@ -327,7 +357,7 @@ <h2 class="text-sm font-semibold text-gray-200 tracking-wide uppercase">Registry
 <p>
 CUBE Registry — <a href="https://github.com/The-AI-Alliance/cube-registry" class="underline">GitHub</a> ·
 <a href="../" class="underline">All benchmarks</a> ·
-Generated 2026-06-01 20:11 UTC
+Generated 2026-06-03 18:46 UTC
 </p>
 </footer>
 
