Define TR caching comms in ATD #353
base: main
To make things more concrete about the interface we want with the backend.
test plan: make
Backwards compatibility summary:
type tr_cache_key = {
  rule_id: rule_id;
  (* ex: http://some-website/hello-world.0.1.2.tgz like in found_dependency *)
  resolved_url: string;
I don't know if this will be easy to find in many cases, though I agree that if it's possible it makes the best key. I guess it is probably fine to start with this, and add another key later if needed.
Yes, we can always refine. This is just defining the interface. Once we start the implementation, we will discover whether we need to refine it.
   * and [transitive_unreachable] records?
   * TODO? make it a list? match_results: ... list; ?
   *)
  match_result: sca_match_kind;
This is a bit odd, since when scanning a package with a rule it will result in direct code matches, not sca matches. Maybe we should just return the match here? Then, the CLI could treat those matches the same as matches that it receives from a call to Semgrep locally.
We could. That is a bigger data structure to store, though, and for TR what we really need is just the sca_transitive_match_kind; caching it is what lets us avoid downloading the dependency and running Semgrep on it.
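A minimal sketch of the lookup described in this reply, written in Python purely for illustration. The `TrCacheKey` fields mirror `tr_cache_key` from the diff; the in-memory dictionary, the `match_kind_for` helper, and the `scan_dependency` callback are hypothetical stand-ins, and the match kind is simplified to a plain string rather than the real `sca_match_kind` variant.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class TrCacheKey:
    # Mirrors the tr_cache_key fields shown in the diff.
    rule_id: str
    resolved_url: str


# Hypothetical in-memory stand-in for the backend cache.
Cache = Dict[TrCacheKey, str]


def match_kind_for(
    cache: Cache,
    key: TrCacheKey,
    scan_dependency: Callable[[TrCacheKey], str],
) -> str:
    """Return the cached match kind, scanning only on a cache miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    # Cache miss: this is the expensive path (download + scan) that
    # the cache is meant to avoid on later lookups.
    result = scan_dependency(key)
    cache[key] = result
    return result
```

With this shape, a second lookup for the same rule and resolved URL never reaches `scan_dependency`, which is the saving the reply is after.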
I think we should at least store the match locations. I don't think it makes sense to store sca_match_kind, because if there are matches in multiple packages, we will somehow need to combine those into a single finding in the CLI.
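To illustrate the concern about combining matches, here is a rough Python sketch that groups per-package match locations under one entry per rule. The `MatchLocation` shape and the `combine_by_rule` helper are hypothetical; nothing here reflects how the CLI actually builds findings.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MatchLocation:
    # Hypothetical shape: where a rule matched inside a package.
    path: str
    line: int


def combine_by_rule(
    per_package: List[Tuple[str, str, List[MatchLocation]]],
) -> Dict[str, List[Tuple[str, MatchLocation]]]:
    """Group (rule_id, resolved_url, locations) entries into one list per rule."""
    findings: Dict[str, List[Tuple[str, MatchLocation]]] = {}
    for rule_id, resolved_url, locations in per_package:
        bucket = findings.setdefault(rule_id, [])
        for loc in locations:
            # Keep the package URL next to each location so the
            # combined finding can still say where the match came from.
            bucket.append((resolved_url, loc))
    return findings
```

Storing only a match kind per package would leave nothing to merge in this step, which is the point the comment raises.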
make setup && make
to update the generated code after editing a .atd file (TODO: have a CI check)

For example, the Semgrep backend needs to still be able to consume data
generated by Semgrep 1.50.0.
See https://atd.readthedocs.io/en/latest/atdgen-tutorial.html#smooth-protocol-upgrades
Note that the types related to the semgrep-core JSON output or the
semgrep-core RPC do not need to be backward compatible!
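
As an illustration of the backward-compatibility requirement, the Python sketch below shows a reader that still accepts JSON produced before a field was added, by giving the new field a default. The `TrCacheResult` record, the `match_locations` field, and `parse_result` are hypothetical, and the match result is simplified to a plain string; the code generated from the .atd file is not shown here.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TrCacheResult:
    match_result: str
    # Hypothetical field added in a later version; older producers omit it.
    match_locations: List[str] = field(default_factory=list)


def parse_result(raw: str) -> TrCacheResult:
    """Parse a result, tolerating payloads that predate match_locations."""
    data: Dict[str, Any] = json.loads(raw)
    return TrCacheResult(
        match_result=data["match_result"],
        # A missing key falls back to the default instead of failing.
        match_locations=list(data.get("match_locations", [])),
    )
```

Adding a required field instead would make the reader reject data from older versions, which is the kind of change the note above warns against.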