add link to blog post (#1966)

khorne3 · web-flow · commit 0a9bf9694529 · 2025-02-12T08:43:46.000-06:00
diff --git a/docs/semgrep-assistant/metrics.md b/docs/semgrep-assistant/metrics.md
@@ -15,7 +15,7 @@ Metrics for evaluating Semgrep Assistant's performance are derived from two sour
 - **User feedback** on Assistant recommendations within the product
 - **Internal triage and benchmarking** conducted by Semgrep's security research team
 
-This methodology ensures that Assistant is evaluated from both a user's and expert's perspective. This gives Semgrep's product and engineering teams a holistic view into Assistant's real-world performance. 
+This methodology ensures that Assistant is evaluated from both a user's and expert's perspective. This gives Semgrep's product and engineering teams a holistic view into Assistant's real-world performance.[^1]
 
 ## User feedback
 
@@ -35,7 +35,7 @@ Users are prompted in-line to "thumbs up" or "thumbs down" Assistant suggestions
         <td><strong>250,000+</strong></td>
     </tr>
     <tr>
-        <td>Average reduction in findings[^1]</td>
+        <td>Average reduction in findings[^2]</td>
         <td><strong>20%</strong></td>
     </tr>
     <tr>
@@ -64,17 +64,19 @@ Internal benchmarks for Assistant run on the same dataset used by Semgrep's secu
         <td><strong>2000+</strong></td>
     </tr>
     <tr>
-        <td>False positive confidence rate[^2]</td>
+        <td>False positive confidence rate[^3]</td>
         <td><strong>96%</strong></td>
     </tr>
     <tr>
-        <td>Remediation guidance confidence rate[^3]</td>
+        <td>Remediation guidance confidence rate[^4]</td>
         <td><strong>80%</strong></td>
     </tr>
 </table>
 
-[^1]:The average % of SAST findings that Assistant filters out as noise.
+[^1]: Learn more about how Semgrep achieved these numbers in [How we built an AppSec AI that security researchers agree with 96% of the time](https://semgrep.dev/blog/2025/building-an-appsec-ai-that-security-researchers-agree-with-96-of-the-time/).
 
-[^2]:False positive confidence rate measures how often Assistant is correct when it identifies a false positive. **A high confidence rate means users can trust when Assistant identifies a false positive - it does not mean that Assistant catches all false positives.** 
+[^2]: The average percentage of SAST findings that Assistant filters out as noise.
 
-[^3]:Remediation guidance is rated on a binary scale of "helpful" / "not helpful".
+[^3]: False positive confidence rate measures how often Assistant is correct when it identifies a false positive. **A high confidence rate means users can trust Assistant when it flags a finding as a false positive. It does not mean that Assistant catches all false positives.**
+
+[^4]: Remediation guidance is rated on a binary scale of "helpful" / "not helpful".