feat(sca): new SCA stats schema (#313)
# Context

The [existing "bytes parsed"
metric](https://metabase.corp.semgrep.dev/dashboard/272-ssc-adoption-dashboard?tab=147-summary)
we are currently relying on provides a limited perspective on customer
SSC adoption, as it aggregates usage across languages without fully
capturing which languages are meaningfully used.

We want to augment the stats schema to improve metrics around Java
dependency detection, lockfile-less scanning, and Python requirements
identification.

- [x] I ran `make setup && make` to update the generated code after
editing a `.atd` file (TODO: have a CI check)
- [x] I made sure we're still backward compatible with old versions of
the CLI. For example, the Semgrep backend still needs to be able to
*consume* data generated by Semgrep 1.17.0. See
https://atd.readthedocs.io/en/latest/atdgen-tutorial.html#smooth-protocol-upgrades
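
As a rough illustration of the backward-compatibility point above, the sketch below shows how a consumer could treat the new `supply_chain_stats` field as optional, so payloads from older CLI versions (which never emit it) still parse. The function name `read_supply_chain_stats` and the sample payloads are hypothetical; this is plain `json` handling, not the generated atdgen/atdpy reader.

```python
import json
from typing import Any, Dict, Optional


def read_supply_chain_stats(raw: str) -> Optional[Dict[str, Any]]:
    """Return the `supply_chain_stats` object, or None when it is absent.

    Hypothetical helper: older CLIs do not send the field at all, so a
    missing key must be treated as "no SCA stats", not as an error.
    """
    payload = json.loads(raw)
    return payload.get("supply_chain_stats")


# Illustrative payloads only; real `ci_scan_complete_stats` objects carry more fields.
old_payload = '{"findings": 3}'
new_payload = '{"findings": 3, "supply_chain_stats": {"subprojects_stats": []}}'

print(read_supply_chain_stats(old_payload))
print(read_supply_chain_stats(new_payload))
```

Declaring the field with the `?` prefix and an `option` type in the `.atd` file is what lets a reader accept both shapes.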
salolivares authored Nov 19, 2024
1 parent a95d339 commit b35746b
Showing 7 changed files with 5,154 additions and 3,447 deletions.
44 changes: 44 additions & 0 deletions semgrep_output_v1.atd
@@ -1687,6 +1687,50 @@ type ci_scan_complete_stats = {
<python repr="dict">
<ts repr="map">
option;

(* since 1.98.0 *)
(* In collaboration with the Data Science team, it was suggested
* that we start to group stats by product for organizational purposes.
*
* This field will only be defined for SCA scans.
*)
?supply_chain_stats: supply_chain_stats option;
}

type resolution_method <ocaml attr="deriving show"> <python decorator="dataclass(frozen=True, order=True)"> = [
| LockfileParsing
| DynamicResolution
]

type dependency_resolution_stats = {
resolution_method: resolution_method;
dependency_count: int;
ecosystem: ecosystem;
}

type dependency_source_file_kind <ocaml attr="deriving show"> <python decorator="dataclass(frozen=True)"> = [
| Lockfile of lockfile_kind
| Manifest of manifest_kind
]

type dependency_source_file = {
kind: dependency_source_file_kind;
path: fpath;
}

type subproject_stats = {
(* The `subproject_id` is derived as a stable hash of the sorted paths of `dependency_source_file`s.
Any change to the set of dependency sources (addition, removal, or modification) results in a new
`subproject_id`, as different dependency sources indicate a different subproject context. *)
subproject_id: string;
(* Files used to determine the subproject's dependencies (lockfiles, manifest files, etc) *)
dependency_sources: dependency_source_file list;
(* Results of dependency resolution, empty if resolution failed *)
?resolved_stats: dependency_resolution_stats option;
}

type supply_chain_stats = {
subprojects_stats: subproject_stats list;
}

type parsing_stats = {
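
The `subproject_id` comment in the hunk above describes a stable hash over the sorted paths of the dependency source files. A minimal sketch of that idea follows, assuming SHA-256 and a newline separator; the commit does not state which hash or encoding is used, and the function name `compute_subproject_id` is hypothetical.

```python
import hashlib
from typing import Iterable


def compute_subproject_id(paths: Iterable[str]) -> str:
    """Hash the sorted dependency source paths into a stable identifier.

    Assumptions (not confirmed by the commit): SHA-256 as the hash and
    "\n" as the separator between paths.
    """
    # Sorting makes the ID independent of the order files were discovered in.
    joined = "\n".join(sorted(paths))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# The same set of paths in a different order yields the same ID.
a = compute_subproject_id(["app/requirements.txt", "app/pyproject.toml"])
b = compute_subproject_id(["app/pyproject.toml", "app/requirements.txt"])

# Adding a dependency source yields a different ID.
c = compute_subproject_id(["app/pyproject.toml", "app/requirements.txt", "app/poetry.lock"])

print(a == b, a == c)
```

Under this sketch, adding or removing a source path changes the ID, while the order in which the paths are supplied does not.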
69 changes: 69 additions & 0 deletions semgrep_output_v1.jsonschema

24 changes: 23 additions & 1 deletion semgrep_output_v1.proto