Releases: semgrep/semgrep-interfaces
Release v1.65.0
Release v1.64.0
1.64.0 - 2024-03-07
Changed
- Removed the AST caching experimental feature (--experimental --ast-caching
in osemgrep and -parsing_cache_dir in semgrep-core). (ast_caching)
- Removed the Registry caching experimental feature (--experimental --registry-caching)
in osemgrep. (registry_caching)
Fixed
- Clean any credentials from project URL before using it, to prevent leakage. (saf-876)
- ci: Updated the logic for the informational message printed when no rules are sent, so that
it displays correctly when secrets is enabled (in addition to
when code is). (scrt-455)
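The credential-cleaning fix (saf-876) can be pictured with a small sketch. This is not Semgrep's implementation: `strip_credentials` is a hypothetical helper, written against Python's standard `urllib.parse`, that drops any `user:password@` part from a URL before the URL is used elsewhere.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url: str) -> str:
    """Return `url` without any userinfo (user[:password]@) component.

    Hypothetical helper for illustration; not part of Semgrep.
    """
    parts = urlsplit(url)
    if parts.hostname is None:
        # No network location (e.g. scp-style git remotes): leave untouched.
        return url
    host = parts.hostname
    if ":" in host:
        # IPv6 literals must be wrapped in brackets again.
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


print(strip_credentials("https://user:token@gitlab.example.com/org/repo.git"))
```

Rebuilding the network location from `hostname` and `port` (rather than editing the string) means the userinfo is dropped by construction.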
Release v1.63.0
1.63.0 - 2024-02-27
Added
- Dataflow: Added support for nested record patterns such as `{ body: { param } }`
in the LHS of an assignment. Now given `{ body: { param } } = tainted` Semgrep
will correctly mark `param` as tainted. (flow-68)
- Matching: `metavariable-regex` can now match on metavariables of interpolated
strings which use variables that have known values. (saf-865)
- Add support for parsing Swift Package Manager manifest and lockfiles (sc-1217)
Fixed
- fix: taint signatures do not capture changes to parameters' fields (flow-70)
- Scan summary links printed after `semgrep ci` scans now reflect a custom SEMGREP_APP_URL, if one is set. (saf-353)
Release v1.62.0
1.62.0 - 2024-02-22
Added
- Pro: Adds support for Python constructors to taint analysis.
  If interfile naming resolves that a Python constructor is called, taint
  will now track these objects with fewer heuristics. Without interfile
  analysis these changes have no effect on the behavior of tainting.

  The overall result is that in the following program the OSS analysis
  would match both calls to sink, while the interfile analysis would only
  match the second call to sink.

  ```
  class A:
      untainted = "not"
      tainted = "not"
      def __init__(self, x):
          self.tainted = x

  a = A("tainted")
  # OK:
  sink(a.untainted)
  # MATCH:
  sink(a.tainted)
  ```
  (ea-272)
- Pro: taint-mode: Added basic support for "index sensitivity", that is,
  Semgrep will track taint on individual indexes of a data structure when
  these are constant values (integers or strings), and the code uses the
  built-in syntax for array indexing in the corresponding language
  (typically `E[i]`). For example, in the Python code below Semgrep Pro
  will not report a finding on `sink(x)` or `sink(x[1])` because it will
  know that only `x[42]` is tainted:

  ```
  x[1] = safe
  x[42] = source()
  sink(x)     # no more finding
  sink(x[1])  # no more finding
  sink(x[42]) # finding
  sink(x[i])  # finding
  ```

  There is still a finding for `sink(x[i])` when `i` is not constant. (flow-7)
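The index-sensitivity behaviour described above (flow-7) can be modelled with a toy sketch. This is only an illustration of the idea, not how Semgrep Pro is implemented; the `IndexTaint` class and its method names are hypothetical.

```python
class IndexTaint:
    """Toy taint state for one variable: container-level taint plus per-index taint."""

    def __init__(self):
        self.whole = False   # is the container itself tainted?
        self.by_index = {}   # constant index (int or str) -> tainted?

    def assign(self, index, tainted):
        # Models `x[index] = <value>` where `index` is a constant.
        self.by_index[index] = tainted

    def read_whole(self):
        # Models `sink(x)`: only container-level taint counts.
        return self.whole

    def read_const(self, index):
        # Models `sink(x[c])` with a constant index c.
        return self.whole or self.by_index.get(index, False)

    def read_unknown(self):
        # Models `sink(x[i])` with a non-constant i: any tainted index may be read.
        return self.whole or any(self.by_index.values())


x = IndexTaint()
x.assign(1, False)   # x[1] = safe
x.assign(42, True)   # x[42] = source()
print(x.read_whole(), x.read_const(1), x.read_const(42), x.read_unknown())
```

A read through a non-constant index is treated conservatively, which is why `sink(x[i])` still yields a finding.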
Changed
- taint-mode: Added `exact: false` sinks so that one can specify that anything
  inside a code region is a sink, e.g. `if (...) { ... }`. This used to be the
  semantics of sink specifications until Semgrep 1.1.0, when we made sink matching
  more precise by default. Now we allow reverting to the old semantics.

  In addition, when `exact: true` (the default), we simplified the heuristic used
  to support traditional `sink(...)`-like specs together with the option
  `taint_assume_safe_functions: true`. Now we consider that if the spec
  formula is not a `patterns` with a `focus-metavariable`, then we must look for
  taint in the arguments of a function call. (flow-1)
- The project name for repos scanned locally will now be `local_scan/<repo_name>` instead
  of simply `<repo_name>`. This will clarify the origin of those findings. Also, the
  "View Results" URL displayed for findings now includes the repository and branch names. (saf-856)
Fixed
- taint-mode: experimental: For now Semgrep CLI taint traces are not adapted to
support multiple labels, so Semgrep picks one arbitrary label to report, which
is sometimes not the desired one. As a temporary workaround, Semgrep will
look at the `requires` of the sink, and if it has the shape `A and ...`, then
it will pick `A` as the preferred label and report its trace. (flow-65)
- Fixed trailing newline parsing in pyproject.toml and poetry.lock files. (gh-9777)
- Fixed an issue that led to incorrect autofix application in certain cases where multiple fixes were applied to the same line. (saf-863)
- The tokens for type parameters brackets are now stored in the generic AST allowing
to correctly autofix those constructs. (tparams)
Release v1.61.1
1.61.1 - 2024-02-14
Added
- Added performance metrics using OpenTelemetry for better visualization.
  Users wishing to understand the performance of their Semgrep scans or
  to help optimize Semgrep can configure the backend collector created in
  `libs/tracing/unix/Tracing.ml`.

  This is experimental and both the implementation and flags are likely to
  change. (ea-320)
- Created a new environment variable SEMGREP_REPO_DISPLAY_NAME for use in semgrep CI.
  Currently, this does nothing. The goal is to provide a way to override the display
  name of a repo in the Semgrep App. (gh-8953)
- The OCaml/C executable (`semgrep-core` or `osemgrep`) is now passed through
  the `strip` utility, which reduces its size by 10-25% depending on the
  platform. Contribution by Filipe Pina (@fopina). (gh-9471)
Changed
- "Missing plugin" errors (i.e., rules that cannot be run without `--pro`) will now
be grouped and reported as a single warning. (ea-842)
Release v1.60.1
1.60.1 - 2024-02-09
Added
- Rule syntax: Metavariables by the name of `$_` are now anonymous, meaning that
  they do not unify within a single pattern or across patterns, and essentially
  just unconditionally specify some expression.

  For instance, the pattern `foo($_, $_)` may match the code `foo(1, 2)`.

  This will change the behavior of existing rules that use the metavariable
  `$_`, if they rely on unification still happening. This can be fixed by simply
  giving the metavariable a real name like `$A`. (ea-837)
- Added infrastructure for semgrep supply chain in semgrep-core. Not fully functional yet. (ssc-port)
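The difference between a named metavariable and the anonymous `$_` (ea-837) can be sketched with a toy matcher. This is an illustration of unification only, not Semgrep's matching engine; `match_args` is a hypothetical function that matches a list of metavariable names against a list of concrete argument values.

```python
def match_args(pattern_args, code_args):
    """Match metavariable names against concrete arguments.

    Named metavariables (e.g. "$A") must bind to the same value everywhere;
    "$_" is anonymous and never binds. Returns the bindings, or None on mismatch.
    """
    if len(pattern_args) != len(code_args):
        return None
    bindings = {}
    for metavar, value in zip(pattern_args, code_args):
        if metavar == "$_":
            continue  # anonymous: matches anything, no unification
        if metavar in bindings and bindings[metavar] != value:
            return None  # same name, different value: unification fails
        bindings[metavar] = value
    return bindings


print(match_args(["$_", "$_"], [1, 2]))  # anonymous: matches foo(1, 2)
print(match_args(["$A", "$A"], [1, 2]))  # named: does not match foo(1, 2)
```

In this model, renaming `$_` to `$A` restores the unification that older rules may have relied on.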
Changed
- Dataflow: Simplified the IL translation for Python `with` statements to let
  symbolic propagation assume that `with foo() as x: ...` entails `x = foo()`,
  so that e.g. `Session().execute("...")` matches:

  ```
  with Session() as s:
      s.execute("SELECT * from T")
  ```
  (CODE-6633)
Fixed
- Output: Semgrep CLI no longer sometimes interpolates metavariables twice when
the message that was substituted for a metavariable itself contained a valid
metavariable to be interpolated. (ea-838)
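The double-interpolation problem (ea-838) is easy to reproduce in a small sketch. The code below is not Semgrep's implementation; both functions and the `METAVAR` regex are hypothetical, and they only contrast a single-pass substitution with a naive sequential one.

```python
import re

# Simplified metavariable syntax, assumed for this sketch: "$" + uppercase identifier.
METAVAR = re.compile(r"\$[A-Z_][A-Z0-9_]*")


def interpolate_once(message, bindings):
    """Substitute each metavariable in a single pass over `message`.

    re.sub never rescans replacement text, so a bound value that itself
    looks like a metavariable is left alone.
    """
    return METAVAR.sub(lambda m: bindings.get(m.group(0), m.group(0)), message)


def interpolate_sequential(message, bindings):
    """Naive variant: one str.replace per metavariable (can interpolate twice)."""
    for name, value in bindings.items():
        message = message.replace(name, value)
    return message


bindings = {"$X": "$Y", "$Y": "secret"}
print(interpolate_once("found $X", bindings))
print(interpolate_sequential("found $X", bindings))
```

Here `$X` is bound to the literal text `$Y`; only the sequential variant goes on to replace that text a second time.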
Release v1.60.0
1.60.0 - 2024-02-08
Added
- Rule syntax: Metavariables by the name of `$_` are now anonymous, meaning that
  they do not unify within a single pattern or across patterns, and essentially
  just unconditionally specify some expression.

  For instance, the pattern `foo($_, $_)` may match the code `foo(1, 2)`.

  This will change the behavior of existing rules that use the metavariable
  `$_`, if they rely on unification still happening. This can be fixed by simply
  giving the metavariable a real name like `$A`. (ea-837)
- Added infrastructure for semgrep supply chain in semgrep-core. Not fully functional yet. (ssc-port)
Fixed
- Output: Semgrep CLI no longer sometimes interpolates metavariables twice when
the message that was substituted for a metavariable itself contained a valid
metavariable to be interpolated. (ea-838)
Release v1.59.1
1.59.1 - 2024-02-02
Added
- taint-mode: Pro: Semgrep can now track taint via static class fields and global
  variables, such as in the following example:

  ```
  static char* x;

  void foo() {
      x = "tainted";
  }

  void bar() {
      sink(x);
  }

  void main() {
      foo();
      bar();
  }
  ```
  (pa-3378)
Fixed
- Pro: Make inter-file analysis more tolerant to small bugs, resorting to graceful
degradation and continuing with the scan, rather than crashing. (pa-3387)
Release v1.59.0
1.59.0 - 2024-01-30
Added
- Swift: Now supports typed metavariables, such as `($X : ty)`. (pa-3370)
Changed
- Add Elixir to Pro languages list in help information. (gh-9609)
- Removed `sg` alias to avoid naming conflicts
  with the shadow-utils `sg` command for Linux systems. (gh-9642)
- Prevent unnecessary computation when running scans without verbose logging enabled (gh-9661)
- Deprecated option `taint_match_on`, introduced in 1.51.0; it is being renamed
  to `taint_focus_on`. Note that `taint_match_on` was experimental, and
  `taint_focus_on` is experimental too. Option `taint_match_on` will continue
  to work but it will be completely removed at some point after 1.63.0. (pa-3272)
- Added information on product-related flags to help output, especially for Semgrep Secrets. (pa-3383)
- taint-mode: Improve inference of best matches for exact-sources, exact-sanitizers,
  and sinks. Now we also avoid FPs in cases such as:

  ```
  dangerouslySetInnerHTML = {
    // ok:
    {__html: props ? DOMPurify.sanitize(props.text) : ''} // no more FPs!
  }
  ```

  where `props` is tainted and the sink specification is:

  ```
  patterns:
    - pattern: |
        dangerouslySetInnerHTML={{__html: $X}}
    - focus-metavariable: $X
  ```

  Previously Semgrep wrongly considered the individual subexpressions of the
  conditional as sinks, including the `props` in `props ? ...`, thus producing a
  false positive. Now it will only consider the conditional expression as a whole
  as the sink. (rules-6457)
- Removed an internal legacy syntax for secrets rules
  (`mode: semgrep_internal_postprocessor`). (scrt-320)
Fixed
- Autofix: Fixes that span multiple lines will now try to align
  inserted fixed lines with each other. (gh-3070)
- Matching: Try blocks with catch clauses can now match try blocks that have
  extraneous catch clauses, as long as the pattern's clauses match a subset. For instance,
  the pattern

  ```
  try:
      ...
  catch A:
      ...
  ```

  can now match

  ```
  try:
      ...
  catch A:
      ...
  catch B:
      ...
  ```
  (gh-3362)
- Previously, some people got the error:

  ```
  Encountered error when running rules: Other syntax error at line NO FILE INFO YET:-1: Invalid_argument: String.sub / Bytes.sub
  ```

  Semgrep should now report this error properly with a file name and line number and
  handle it gracefully. (gh-9628)
- Fixed Dockerfile parsing bug where multiline comments were parsed incorrectly. (gh-9628-2)
- The language server will now properly respect findings that have been ignored via the app (lsp-fingerprints)
- taint-mode: Pro: Semgrep will now propagate taint via instance variables when
  calling methods within the same class, making this example work:

  ```
  class Test {
    private String str;

    public setStr() {
      this.str = "tainted";
    }

    public useStr() {
      //ruleid: test
      sink(this.str);
    }

    public test() {
      setStr();
      useStr();
    }
  }
  ```
  (pa-3372)
- taint-mode: Pro: Taint traces will now reflect when taint is propagated via
  class fields, such as in this example:

  ```
  class Test {
    private String str;

    public setStr() {
      this.str = "tainted";
    }

    public useStr() {
      //ruleid: test
      sink(this.str);
    }

    public test() {
      setStr();
      useStr();
    }
  }
  ```

  Previously Semgrep would report that taint originated at `this.str = "tainted"`,
  but it would not tell you how the control flow got there. Now the taint trace
  will indicate that we get there by calling `setStr()` inside `test()`. (pa-3373)
- Addressed an issue related to matching top-level identifiers with meta-variable
  qualified patterns in C++, such as matching ::foo with ::$A::$B. This problem
  was specific to Pro Engine-enabled scans. (pa-3375)
Release v1.58.0
1.58.0 - 2024-01-23
Added
- Added a severity icon (e.g. "❯❯❱") and corresponding color to our CLI text output
  for findings of known severity. (grow-97)
- Naming has better support for if statements. In particular, for
  languages with block scope, shadowed variables inside if-else blocks
  that are tainted won't "leak" outside of those blocks.

  This helps with features related to naming, such as tainting.
  For example, previously in Go, the x in sink(x) would be reported
  as tainted, even though the x that is tainted is the
  one inside the scope of the if block.

  ```
  func f() {
    x := "safe";
    if (c) {
      x := "tainted";
    }
    // x should not be tainted
    sink(x);
  }
  ```

  This is now fixed. (pa-3185)
- OSemgrep can now scan remote git repositories. Pass
  `--experimental --pro --remote http[s]://<website>/.../<repo>.git`
  to use this feature (pa-remote)
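The block-scope fix for shadowed variables (pa-3185) comes down to resolving each use of a name to its nearest enclosing declaration. The sketch below is a toy model of that lookup, mirroring the Go example above; it is not Semgrep's naming phase, and the `Scope` class is hypothetical.

```python
class Scope:
    """A lexical scope that maps variable names to a tainted flag."""

    def __init__(self, parent=None):
        self.parent = parent
        self.vars = {}

    def declare(self, name, tainted):
        # Like Go's `x := ...`: always creates a binding in *this* scope.
        self.vars[name] = tainted

    def lookup(self, name):
        # Walk outwards until the nearest enclosing declaration is found.
        scope = self
        while scope is not None:
            if name in scope.vars:
                return scope.vars[name]
            scope = scope.parent
        raise NameError(name)


func_scope = Scope()
func_scope.declare("x", tainted=False)   # x := "safe"
if_scope = Scope(parent=func_scope)
if_scope.declare("x", tainted=True)      # x := "tainted" (shadows the outer x)

# sink(x) sits in the function scope, so it sees the outer, untainted x.
print(func_scope.lookup("x"))
```

Because the inner declaration lives only in the `if` block's scope, it cannot "leak" into lookups made from the enclosing function scope.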
Changed
- Rules stored under a "hidden" directory (e.g., dir/.hidden/myrule.yml)
are now processed when using --config .
We used to skip dot files under dir, while keeping rules/.semgrep.yml
but not path/.github/foo.yml, keeping src/.semgrep/bad_pattern.yml
but not ./.pre-commit-config.yaml, and so on. This was mainly because
we used to fetch rules from ~/.semgrep/ implicitly when --config
was not given, but this feature was removed, so now we can keep it simple. (hidden_rules)
- Removed support for writing rules using jsonnet. This feature
will be restored once we finish the port to OCaml of the semgrep CLI. (jsonnet)
- The primitive object construct expression will no longer match the new
expression pattern. For example, the pattern `new $TYPE` will now only match
`new int`, not `int()`. (pa-3336)
- The placement new expression will no longer match the new expression without
placement. For instance, the pattern `new ($STORAGE) $TYPE` will now only match
`new (storage) int` and not `new int`. (pa-3338)
Fixed
- Java: You can now use metavariable ellipses properly in
  function arguments, as statements, and as expressions.

  For instance, you may write the pattern

  ```
  public $F($...ARGS) { ... }
  ```
  (gh-9260)
- Nosemgrep: Fixed a bug where Semgrep would err upon reading a `nosemgrep`
  comment with multiple rule IDs. (gh-9463)
- Fixed bugs in gitignore/semgrepignore globbing implementation affecting
  `--experimental`. (gh-9544)
- Fixed rule IDs, descriptions, findings, and autofix text not wrapping as expected.
  Use newline instead of horizontal separator for findings with a shared file
  but for different rules per design spec. (grow-97)
- Keep track of the origin of `return;` statements in the dataflow IL so that
  recently added (Pro-only) `at-exit: true` sinks work properly on them. (pa-3337)
- C++: Improve translation of `delete` expressions to the dataflow IL so that
  recently added (Pro-only) `at-exit: true` sinks work on them. Previously
  `delete` expressions at "exit" positions were not being properly recognized
  as such. (pa-3339)
- cli: fix python runtime error with 0 width wrapped printing (pa-3366)
- Fixed a bug where Gemfile.lock files with multiple GEM sections
  would not be parsed correctly. (sc-1230)