45 | 45 | </div> |
46 | 46 | <h1>Ask your codebase precise questions.<br>Get grounded answers.</h1> |
47 | 47 | <p class="hero-sub"> |
48 | | - Noumenon builds a <a href="https://www.datomic.com">Datomic</a> knowledge graph from your repository so agents query structured facts instead of dumping raw files into context windows. In benchmarks across 9 repos and 8 languages, graph-augmented answers scored <strong>2× higher</strong> on average. |
| 48 | + Noumenon builds a <a href="https://www.datomic.com">Datomic</a> knowledge graph from your repository so agents query structured facts instead of dumping raw files into context windows. In benchmarks across 9 repos and 8 languages, graph-augmented answers scored <strong>2.8× higher</strong> on average. |
49 | 49 | </p> |
50 | 50 | <div class="hero-actions"> |
51 | 51 | <a href="#get-started" class="btn btn-primary">Get Started</a> |
@@ -85,7 +85,7 @@ <h2 class="section-title">Why structured knowledge</h2> |
85 | 85 | <div class="problem-grid"> |
86 | 86 | <div class="card"> |
87 | 87 | <h3>Context windows don't scale</h3> |
88 | | - <p>A Datalog query returns exactly the entities a question needs. In benchmarks, graph context improved LLM accuracy by <strong>+19.7 percentage points</strong> on average.</p> |
| 88 | + <p>A Datalog query returns exactly the entities a question needs. In benchmarks, graph context improved LLM accuracy by <strong>+27.5 percentage points</strong> on average.</p> |
89 | 89 | </div> |
90 | 90 | <div class="card"> |
91 | 91 | <h3>Answers you can't verify</h3> |
@@ -139,7 +139,7 @@ <h3>Recursive LLM Querying</h3> |
139 | 139 | </div> |
140 | 140 | <div class="principle"> |
141 | 141 | <span class="principle-label text-purple">Measurable</span> |
142 | | - <p>Built-in A/B benchmarks. 9 repos, 8 languages: 18.2% (raw) to 37.9% (graph-augmented).</p> |
| 142 | + <p>Built-in A/B benchmarks. 9 repos, 8 languages: 15.2% (raw) to 42.7% (graph-augmented).</p> |
143 | 143 | </div> |
144 | 144 | <div class="principle"> |
145 | 145 | <span class="principle-label text-muted">Local</span> |
@@ -209,36 +209,36 @@ <h2 class="section-title">Measured on real codebases</h2> |
209 | 209 | </tr> |
210 | 210 | </thead> |
211 | 211 | <tbody> |
212 | | - <tr><td>flask</td><td>Python</td><td>12.5%</td><td>41.2%</td><td class="text-green"><strong>+28.8pp</strong></td></tr> |
213 | | - <tr><td>fzf</td><td>Go</td><td>13.8%</td><td>42.5%</td><td class="text-green"><strong>+28.8pp</strong></td></tr> |
214 | | - <tr><td>express</td><td>JavaScript</td><td>18.8%</td><td>45.0%</td><td class="text-green"><strong>+26.2pp</strong></td></tr> |
215 | | - <tr><td>fresh</td><td>TypeScript</td><td>12.5%</td><td>35.0%</td><td class="text-green"><strong>+22.5pp</strong></td></tr> |
216 | | - <tr><td>guava</td><td>Java</td><td>2.5%</td><td>23.8%</td><td class="text-green"><strong>+21.3pp</strong></td></tr> |
217 | | - <tr><td>ripgrep</td><td>Rust</td><td>12.5%</td><td>30.0%</td><td class="text-green"><strong>+17.5pp</strong></td></tr> |
218 | | - <tr><td>redis</td><td>C</td><td>11.3%</td><td>26.3%</td><td class="text-green"><strong>+15.0pp</strong></td></tr> |
219 | | - <tr><td>ring</td><td>Clojure</td><td>51.2%</td><td>60.0%</td><td class="text-green"><strong>+8.8pp</strong></td></tr> |
220 | | - <tr><td>noumenon</td><td>Clojure</td><td>28.8%</td><td>37.5%</td><td class="text-green"><strong>+8.8pp</strong></td></tr> |
| 212 | + <tr><td>fresh</td><td>TypeScript</td><td>0.0%</td><td>41.3%</td><td class="text-green"><strong>+41.3pp</strong></td></tr> |
| 213 | + <tr><td>fzf</td><td>Go</td><td>2.5%</td><td>38.8%</td><td class="text-green"><strong>+36.3pp</strong></td></tr> |
| 214 | + <tr><td>ripgrep</td><td>Rust</td><td>2.5%</td><td>37.5%</td><td class="text-green"><strong>+35.0pp</strong></td></tr> |
| 215 | + <tr><td>flask</td><td>Python</td><td>10.0%</td><td>41.3%</td><td class="text-green"><strong>+31.3pp</strong></td></tr> |
| 216 | + <tr><td>redis</td><td>C</td><td>2.5%</td><td>26.3%</td><td class="text-green"><strong>+23.8pp</strong></td></tr> |
| 217 | + <tr><td>express</td><td>JavaScript</td><td>31.3%</td><td>53.8%</td><td class="text-green"><strong>+22.5pp</strong></td></tr> |
| 218 | + <tr><td>noumenon</td><td>Clojure</td><td>23.8%</td><td>45.0%</td><td class="text-green"><strong>+21.3pp</strong></td></tr> |
| 219 | + <tr><td>guava</td><td>Java</td><td>7.5%</td><td>26.3%</td><td class="text-green"><strong>+18.8pp</strong></td></tr> |
| 220 | + <tr><td>ring</td><td>Clojure</td><td>56.3%</td><td>73.8%</td><td class="text-green"><strong>+17.5pp</strong></td></tr> |
221 | 221 | </tbody> |
222 | 222 | <tfoot> |
223 | | - <tr><td><strong>Average</strong></td><td></td><td><strong>18.2%</strong></td><td><strong>37.9%</strong></td><td class="text-green"><strong>+19.7pp</strong></td></tr> |
| 223 | + <tr><td><strong>Average</strong></td><td></td><td><strong>15.2%</strong></td><td><strong>42.7%</strong></td><td class="text-green"><strong>+27.5pp</strong></td></tr> |
224 | 224 | </tfoot> |
225 | 225 | </table> |
226 | 226 | <div class="benchmark-notes"> |
227 | 227 | <div class="card"> |
228 | 228 | <h3>Biggest gains on unfamiliar repos</h3> |
229 | | - <p>Flask, fzf, and Express saw +26–29pp — the graph fills in what the LLM lacks from training data.</p> |
| 229 | + <p>Fresh scored 0% without the graph and 41% with it — the knowledge graph provides what the LLM simply doesn't know.</p> |
230 | 230 | </div> |
231 | 231 | <div class="card"> |
232 | | - <h3>Factual lookups improved most</h3> |
233 | | - <p>Single-hop accuracy (e.g. “which files import X?”) jumped from 29.5% to 65.9% on Ring — +36pp.</p> |
| 232 | + <h3>Best overall: Ring at 73.8%</h3> |
| 233 | + <p>Ring scored the highest absolute Full score. Deterministic accuracy hit 75.0%, with LLM-judged questions close behind at 72.2%.</p> |
234 | 234 | </div> |
235 | 235 | <div class="card"> |
236 | 236 | <h3>8 languages, zero failures</h3> |
237 | 237 | <p>Clojure, Python, JavaScript, TypeScript, C, Go, Rust, and Java all completed the full pipeline successfully.</p> |
238 | 238 | </div> |
239 | 239 | </div> |
240 | 240 | <p class="section-sub" style="margin-top: 1.5rem;"> |
241 | | - <a href="https://github.com/leifericf/noumenon/blob/main/reports/digest-run-2026-03-27.md">Read the full benchmark report →</a> |
| 241 | + Last updated: 2026-03-28. <a href="https://github.com/leifericf/noumenon/blob/main/reports/digest-run-2026-03-28.md">Read the full benchmark report →</a> |
242 | 242 | </p> |
243 | 243 | </div> |
244 | 244 | </section> |
|