Commit 0a9bf96

authored
add link to blog post (#1966)
1 parent 333da9c commit 0a9bf96

1 file changed, +9 -7 lines changed

docs/semgrep-assistant/metrics.md

+9-7
@@ -15,7 +15,7 @@ Metrics for evaluating Semgrep Assistant's performance are derived from two sour
 - **User feedback** on Assistant recommendations within the product
 - **Internal triage and benchmarking** conducted by Semgreps security research team
 
-This methodology ensures that Assistant is evaluated from both a user's and expert's perspective. This gives Semgrep's product and engineering teams a holistic view into Assistant's real-world performance.
+This methodology ensures that Assistant is evaluated from both a user's and expert's perspective. This gives Semgrep's product and engineering teams a holistic view into Assistant's real-world performance.[^1]
 
 ## User feedback
 
@@ -35,7 +35,7 @@ Users are prompted in-line to "thumbs up" or "thumbs down" Assistant suggestions
 <td><strong>250,000+</strong></td>
 </tr>
 <tr>
-<td>Average reduction in findings[^1]</td>
+<td>Average reduction in findings[^2]</td>
 <td><strong>20%</strong></td>
 </tr>
 <tr>
@@ -64,17 +64,19 @@ Internal benchmarks for Assistant run on the same dataset used by Semgrep's secu
 <td><strong>2000+</strong></td>
 </tr>
 <tr>
-<td>False positive confidence rate[^2]</td>
+<td>False positive confidence rate[^3]</td>
 <td><strong>96%</strong></td>
 </tr>
 <tr>
-<td>Remediation guidance confidence rate[^3]</td>
+<td>Remediation guidance confidence rate[^4]</td>
 <td><strong>80%</strong></td>
 </tr>
 </table>
 
-[^1]:The average % of SAST findings that Assistant filters out as noise.
+[^1]: Learn more about how Semgrep achieved these numbers in [How we built an AppSec AI that security researchers agree with 96% of the time](https://semgrep.dev/blog/2025/building-an-appsec-ai-that-security-researchers-agree-with-96-of-the-time/).
 
-[^2]:False positive confidence rate measures how often Assistant is correct when it identifies a false positive. **A high confidence rate means users can trust when Assistant identifies a false positive - it does not mean that Assistant catches all false positives.**
+[^2]:The average % of SAST findings that Assistant filters out as noise.
 
-[^3]:Remediation guidance is rated on a binary scale of "helpful" / "not helpful".
+[^3]:False positive confidence rate measures how often Assistant is correct when it identifies a false positive. **A high confidence rate means users can trust when Assistant identifies a false positive - it does not mean that Assistant catches all false positives.**
+
+[^4]:Remediation guidance is rated on a binary scale of "helpful" / "not helpful".
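
Note on the false positive confidence rate footnote: the distinction it draws (how often a false-positive verdict is correct, versus how many false positives are caught at all) corresponds to precision versus recall. The sketch below illustrates that distinction only. The function names and all counts are hypothetical and chosen for illustration; they are not Semgrep's data or its actual evaluation code.

```python
def confidence_rate(agreed_flags: int, disagreed_flags: int) -> float:
    """Fraction of findings flagged as false positives that reviewers agreed
    were false positives (precision). Hypothetical helper for illustration."""
    flagged = agreed_flags + disagreed_flags
    if flagged == 0:
        raise ValueError("no findings were flagged as false positives")
    return agreed_flags / flagged


def coverage(agreed_flags: int, missed_false_positives: int) -> float:
    """Fraction of all real false positives that were flagged (recall).
    Hypothetical helper for illustration."""
    total = agreed_flags + missed_false_positives
    if total == 0:
        raise ValueError("the dataset contains no false positives")
    return agreed_flags / total


# Hypothetical counts, not taken from the source document.
agreed = 96      # flagged as false positive, reviewers agreed
disagreed = 4    # flagged as false positive, reviewers disagreed
missed = 104     # real false positives that were never flagged

print(f"confidence rate: {confidence_rate(agreed, disagreed):.0%}")
print(f"coverage:        {coverage(agreed, missed):.0%}")
```

With these made-up counts the confidence rate is high while coverage is much lower, which is the situation the bolded sentence in the footnote warns about.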
