Releases: semgrep/semgrep-interfaces
Release v1.30.0
1.30.0 - 2023-06-28
Added
- feat(rule syntax): Support metavariable-type field for Kotlin, Go, Scala

  The `metavariable-type` field is now supported for Kotlin, Go and Scala. (gh-8147)

- feat(rule syntax): Support metavariable-type field for csharp, typescript, php, rust

  The `metavariable-type` field is now supported for csharp, typescript, php, rust. (gh-8164)

- Pattern syntax: You may now introduce metavariables from parts of regular expressions using `pattern-regex`, by using regular expressions with named capturing groups (see https://www.regular-expressions.info/named.html).

  Now, such capture group metavariables must be explicitly named. So for instance, the pattern `pattern-regex: "foo-(?P<X>.*)"` binds what is matched by the capture group to the metavariable `$X`, which can be used as normal.

  `pattern-regex` patterns with unnamed capture groups, such as `pattern-regex: "(.*)"`, will still introduce metavariables of the form `$1`, `$2`, etc., but this should be considered deprecated behavior, and that functionality will be removed in a future release. Named capturing groups should be used instead. (pa-2765)

- Rule syntax: Errors during rule parsing are now more informative. For instance, parsing will now complain if you miss a hyphen in a list of patterns, or if you try to give a string to `patterns` or `pattern-either`. (pa-2877)

- JS/TS: Now, patterns of records with ellipses, like:

  ```
  { $X: ... }
  ```

  properly match records of anonymous functions, like:

  ```
  { func: () => { return 1; } }
  ```

  (pa-2878)
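The named-group form used with `pattern-regex` above, `(?P<name>...)`, is also accepted by Python's `re` module, so the binding behavior can be sketched outside Semgrep. The snippet below is only an illustration and not Semgrep's implementation; the helper name `bind_metavariables` and the sample input `foo-bar` are made up for the example.

```python
import re


def bind_metavariables(regex: str, text: str) -> dict:
    """Return {"$NAME": matched_text} for each named group in `regex`.

    Illustrative helper only: it mimics how a named capture group
    becomes a metavariable of the same name.
    """
    match = re.search(regex, text)
    if match is None:
        return {}
    return {f"${name}": value for name, value in match.groupdict().items()}


# Named group: the capture is available under an explicit name.
named = bind_metavariables(r"foo-(?P<X>.*)", "foo-bar")

# Unnamed group: nothing is bound by name; the capture is only reachable
# by position, analogous to the deprecated $1, $2 metavariables.
unnamed = re.search(r"(.*)", "foo-bar")
first_positional = unnamed.group(1)
```

Here `named` holds a single `$X` entry, while the unnamed pattern yields no named bindings at all, which is why the positional form is harder to use reliably in a rule.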
Changed
- engine: Removed matching cache optimization which had been previously disabled by
default in 1.22.0 (we got no reports of any performance regression during this time). (cleanup-1)
Fixed
- Language server no longer crashes when a user is logged in and opens a non git repo folder (pa-2886)
- It is not required anymore to have semgrep (and pysemgrep) in the PATH. (pa-2895)
Release v1.29.0
1.29.0 - 2023-06-26
Added
- feat(rule syntax): Metavariable Type Extension for Semgrep Rule Syntax

  We've added a dedicated field for annotating the type information of metavariables. By adopting this approach, instead of relying solely on language-specific casting syntax, we provide an additional way to enhance the overall usability by eliminating the need to write redundant type cast expressions for a single metavariable.

  Moreover, the new syntax brings other benefits, including improved support for target languages that lack built-in casting syntax. It also promotes a unified approach to expressing type, pattern, and regex constraints for metavariables, resulting in improved consistency across rule definitions.

  Current syntax:

  ```yaml
  rules:
    - id: no-string-eqeq
      severity: WARNING
      message: find errors
      languages:
        - java
      patterns:
        - pattern-not: null == (String $Y)
        - pattern: $X == (String $Y)
  ```

  Added syntax:

  ```yaml
  rules:
    - id: no-string-eqeq
      severity: WARNING
      message: find errors
      languages:
        - java
      patterns:
        - pattern-not: null == $Y
        - pattern: $X == $Y
        - metavariable-type:
            metavariable: $Y
            type: String
  ```

  (gh-8119)

- feat(rule syntax): Support metavariable-type field for Python

  The `metavariable-type` field is now supported for Python too. (gh-8126)

- New `--experimental` flag to switch to a new implementation of Semgrep entirely written in OCaml with faster startup time, incremental display of matches, AST and registry caching, a new interactive mode and more. Not all features of the legacy Python Semgrep have been ported though. (osemgrep)

- Matching: Writing a pattern which is a sequence of statements, such as `foo(); ... bar();` now allows matching to sequences of statements within objects, classes, and related language constructs, in all languages. (pa-2754)
Changed
- taint-mode: Several improvements to `taint_assume_safe_{booleans,numbers}` options. Most notably, we will now use type info provided by explicit type casts, and we will also use const-prop info to infer types. (pa-2777)
Fixed
- Added support for post-PEP 614 decorators; Semgrep now accepts decorators of the form `@ named_expr_test NEWLINE`. For example, the pattern `lambda $X:$X($X)` matches both decorators below:

  ```python
  #match 1
  @omega := lambda ha:ha(ha)
  def func():
      return None

  #match 2
  @omega[lambda a:a(a)].a.b.c.f("wahoo")
  def fun():
      return None
  ```

  (gh-4946)
- Fixed a typing issue with Go, where Semgrep with the pattern `($VAR : *tau.rho).$F()` wouldn't produce a match in the following:

  ```go
  func f() {
    i_1 := &tau.rho{}
    i_2 := new(tau.rho)
    i_1.shift() //miss one
    i_2.left() //miss two
    return 101
  }
  ```

  but now we don't miss those two findings! (gh-6733)
- Constant propagation is now applied to stack array declarations in C, so a pattern `$TYPE $NAME[101];` will now produce two matches in the following snippet:

  ```c
  int main() {
    int bad_len = 101;
    /* match 1 */
    int arr1[101];
    /* match 2 */
    int arr2[bad_len];
    return 0;
  }
  ```

  (gh-8037)
- Solidity: allow metavariables for version, as in `pragma solidity >= $VER;` (gh-8104)
- Added support for parsing patterns of the form

  ```
  #[Attr1]
  #[Attr2]
  ```

  In code such as

  ```php
  #[Attr1]
  #[Attr2]
  function test ()
  {
      echo "Test";
  }
  ```

  Previously, to match against multiple attributes it was required to write

  ```
  #[Attr1, Attr2]
  ```

  (pa-7398)
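For readers less familiar with the relaxed decorator grammar behind the first fix in this list (gh-4946), the sketch below shows two decorator forms that only parse on Python 3.9 or later: a subscript expression and a named expression. The names `tag`, `registry`, and `keep` are hypothetical and exist only for this illustration; it says nothing about how Semgrep parses such code internally.

```python
# Requires Python 3.9+, where PEP 614 relaxed the decorator grammar.


def tag(label):
    """Return a decorator that records `label` on the decorated function."""
    def decorate(func):
        func.label = label
        return func
    return decorate


registry = {"audit": tag("audit")}


# A subscript expression used as a decorator (a syntax error before 3.9).
@registry["audit"]
def first():
    return None


# A named expression used as a decorator: binds `keep`, then applies it.
@keep := tag("kept")
def second():
    return None
```

Both functions end up carrying the label set by their decorator, and `keep` stays bound at module level after the second definition.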
Release v1.28.0
1.28.0 - 2023-06-21
Added
- Added lone decorators as a valid Python Semgrep pattern, so for example `$NAME($X)` will generate two separate findings here:

  ```python
  @hello("world")
  @hi("semgrep!")
  def shift():
      return "left!"
  ```

  (gh-4722)

- Add tags to the Python wheel for 3.10 and 3.11 (gh-8040)

- JS/TS: Patterns for class properties can now have the `static` and `async` modifiers. For instance:

  ```
  @Foo(...)
  async bar(...) {
    ...
  }
  ```

  or

  ```
  @Foo(...)
  static bar(...) {
    ...
  }
  ```

  (pa-2675)

- Semgrep Language Server now supports multi-folder workspaces (pa-2772)

- New pre-commit hook `semgrep-ci` to use CI rules in pre-commit, which will pull from the rule board + block those in the block column (pa-2795)

- Added support for date comparison and functionality to get the current date. Currently this requires date strings to be in the format "yyyy-mm-dd"; the next step is to support other formats. (pa-7992)
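As a rough Python analogue of the date comparison in the last item above (pa-7992), the sketch below parses a "yyyy-mm-dd" string and compares it with the current date. It does not show Semgrep's own rule syntax; the function name `is_before_today` and the optional `today` parameter (added so the example is deterministic) are assumptions made for illustration.

```python
from datetime import date
from typing import Optional


def is_before_today(date_string: str, today: Optional[date] = None) -> bool:
    """Parse a "yyyy-mm-dd" string and report whether it lies before today.

    `today` can be supplied explicitly to make the comparison reproducible;
    when omitted, the current date is used.
    """
    reference = today if today is not None else date.today()
    # fromisoformat raises ValueError for strings such as "21/06/2023".
    return date.fromisoformat(date_string) < reference
```

A string in any other layout is rejected with `ValueError` rather than silently compared, which mirrors the single-format restriction stated in the entry.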
Changed
- The output of `--debug` will be much less verbose by default; it will only show internal warning and error messages. (debug-1)
- Updated the maximum number of cores autodetected to 16 to prevent overloading on large machines when users do not specify the number of jobs themselves (pa-2807)
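The core cap in the last item (pa-2807) amounts to taking the smaller of the detected core count and 16. A minimal sketch, assuming a hypothetical helper named `default_jobs`; Semgrep's actual detection code is not shown in the changelog:

```python
import os
from typing import Optional

MAX_AUTODETECTED_CORES = 16  # cap stated in the changelog entry


def default_jobs(detected_cores: Optional[int] = None) -> int:
    """Pick a default job count: the detected cores, capped at 16."""
    if detected_cores is None:
        # os.cpu_count() may return None when the count is unknown.
        detected_cores = os.cpu_count() or 1
    return max(1, min(detected_cores, MAX_AUTODETECTED_CORES))
```

The cap only affects the autodetected default; per the entry, users who specify the number of jobs themselves are not limited by it.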
Fixed
- taint analysis: Improve handling of dataflow for tainted value propagation in class field definitions

  This change resolves an issue where dataflow was not correctly accounted for when tainted values flowed through field definitions in class/object definitions. For instance, in Kotlin or Scala, singleton objects are commonly used to encapsulate executable logic, where each field definition behaves like a statement during object initialization. In order to handle this scenario, we have introduced an additional step to analyze a sequence of field definitions as a sequence of statements for taint analysis. This enhancement allows us to accurately track tainted values during object initialization. (gh-7742)

- Allow any characters in file paths used to create dotted rule IDs. File path characters that aren't allowed in rule IDs are simply removed. For example, a rule whose ID is `my-rule` found in the file `hello/@world/rules.yaml` becomes `hello.world.my-rule`. (gh-8057)

- Diff aware scans now work when git state isn't clean (pa-2795)
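The dotted rule ID behavior described above (gh-8057) can be approximated as follows. This is a sketch under stated assumptions: the set of characters allowed in rule IDs is guessed to be letters, digits, `_` and `-`, and the function name `dotted_rule_id` is hypothetical; Semgrep's real validation rules may differ.

```python
import re
from pathlib import PurePosixPath

# Assumption: rule IDs allow letters, digits, "_" and "-".
# The actual allowed set is not given in the changelog.
_DISALLOWED = re.compile(r"[^A-Za-z0-9_-]")


def dotted_rule_id(rules_file: str, rule_id: str) -> str:
    """Prefix `rule_id` with the cleaned directory components of `rules_file`."""
    directories = PurePosixPath(rules_file).parent.parts
    # Remove characters that are not allowed, then drop empty components.
    cleaned = [_DISALLOWED.sub("", part) for part in directories]
    return ".".join([part for part in cleaned if part] + [rule_id])


example = dotted_rule_id("hello/@world/rules.yaml", "my-rule")
```

With the changelog's own example, the `@` is dropped from the `@world` directory and the result is `hello.world.my-rule`.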
Release v1.27.0
1.27.0 - 2023-06-13
Added
- PHP: Added composer ecosystem parser (gh-7734)
- Pro: taint-mode: Java: Semgrep can now relate Java properties and their corresponding
getters/setters even when these are autogenerated (so the actual getters/setters are
not declared in the sources). (pa-2833)
Fixed
- semgrep-core now validates rule IDs. This should not affect users since rule
ID validation is done by the Python wrapper. (gh-8026)
Release v1.26.0
1.26.0 - 2023-06-09
Added
- In Java, Semgrep can now track taint through more getters and setters. It could already relate setters to getters (e.g. `o.setX(taint); o.getX()`) but now it can relate setters and getters to properties (e.g. `o.setX(taint); o.x`). (getters)
- taint-mode: Added experimental options `taint_assume_safe_booleans` and `taint_assume_safe_numbers` to avoid propagating taint coming from expressions with Boolean or number (integer, float) types. (pa-2777)
Fixed
- swift: Support if let shorthand for shadowing an existing optional variable. (gh-7583)
- Elixir: fix the string extraction used for -filter_irrelevant_rules (gh-7855)
- Fixed comparison of taint information that was causing duplicate taints to be tracked. Interfile analysis on large repos will see a small speedup. (misc-1)
- taint-mode: Fixed performance regression in 1.24.0 that affected taint rules. (pa-2777-1)
- Fix a recent regression that caused failures to match in certain cases that combined metavariable-regex and typed metavariables which themselves contain metavariables (e.g. in Go `($X: $T)` with a `metavariable-regex` operating on `$T`). (pa-2822)
- Gomod comments: fix parsing comments that end in ')' (sc-716)
Release v1.25.0
1.25.0 - 2023-06-06
Added
- aliengrep: new option 'generic_caseless' to achieve case-insensitive matching (gh-7883)
- Semgrep now includes heuristics based on the Java standard library and common naming patterns. These allow Semgrep to determine the types of more expressions in Java, for use with typed metavariables (https://semgrep.dev/docs/writing-rules/pattern-syntax/#typed-metavariables). (heuristics)
- Language server now supports search (and replace) with semgrep patterns through semgrep/search (ls-search)
- Language Server will now notify users of errors, and reason for crash (pa-2791)
Fixed
- Pro (taint analysis): Check function calls without parameters or parenthesis in Ruby (gh-7787)
- Aliengrep: ellipsis patterns that would be useless because of being placed at the extremity of a pattern (always) or a line (in single-line mode) are now anchored to the beginning/end of input/line. For example, `...` in multiline mode matches the whole input rather than matching nothing many times. (gh-7881)
- Fixed bug in constant propagation that made Semgrep fail to compute the value of an integer constant when this was obtained via the multiplication of two other constants. (gh-7893)
- Fix regexps potentially vulnerable to ReDoS attacks in Python code for parsing git URLs. Sets maximum length of git URLs to 1024 characters since parsing is still perceptibly slow on 5000-byte input. Reported by Sebastian Chnelik, PyUp.io. (gh-7943)
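The length limit in the ReDoS fix above (gh-7943) is a common mitigation: reject over-long input before any regular expression runs. A minimal sketch of that idea follows. The regular expression is a simplified, hypothetical stand-in and the function name `parse_git_url` is made up; only the 1024-character limit comes from the changelog.

```python
import re
from typing import Tuple

MAX_GIT_URL_LENGTH = 1024  # limit stated in the changelog entry

# Simplified, hypothetical pattern; the regexps Semgrep actually uses
# are not shown in the changelog.
_GIT_URL = re.compile(r"^(?:https://|git@)([^/:]+)[/:](.+?)(?:\.git)?$")


def parse_git_url(url: str) -> Tuple[str, str]:
    """Return (host, repository path), rejecting over-long input first."""
    if len(url) > MAX_GIT_URL_LENGTH:
        raise ValueError(f"git URL longer than {MAX_GIT_URL_LENGTH} characters")
    match = _GIT_URL.match(url)
    if match is None:
        raise ValueError("unrecognized git URL")
    return match.group(1), match.group(2)
```

Checking the length first bounds the worst-case matching time regardless of how the pattern behaves on adversarial input.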
Release v1.24.1
1.24.1 - 2023-06-01
Fixed
- Yarn v1: fix parsing for package headers without version constraints (sc-749)
Release v1.24.0
1.24.0 - 2023-05-31
Added
- New experimental aliengrep engine that can be used as an alternative to the default spacegrep engine with `options.generic_engine: aliengrep`. (aliengrep)

- Pro: Taint labels now mostly work interprocedurally, except for labeled propagators. Note that taint labels are experimental! (pa-2507)

- Pro: Taint-mode now supports inter-procedural field-sensitivity for JS/TS.

  For example, given this class:

  ```js
  class Obj {
    constructor(x, y) {
      this.x = x;
      this.y = y;
    }
  }
  ```

  Semgrep knows that an object constructed by `new Obj("tainted", "safe")` has its `x` attribute tainted, whereas its `y` attribute is safe. (pa-2570)
Changed
- Set limits to the amount of taint that is tracked by Semgrep to prevent perf
issues. (pa-2570)
Fixed
- Allow symbolic propagation for rvals in lhs of assignments. (gh-6780)
- XML: you can now use metavariable-comparison on XML attributes or XML text body (gh-7709)
- Java: support for record patterns (gh-7911)
- C#: support ellipsis in enum declarations (gh-7914)
- Fixed a recent regression which caused typed metavariables to fail to match when the type itself also contained a metavariable, and the target was a builtin type. For example, the pattern `(List<$T> $X)` would fail to match a value of type `List<String>`. (typed-mvar)
Release v1.23.0
1.23.0 - 2023-05-24
Added
- On scan completion during logged-in `semgrep ci` scans, check the returned exit code to see if the scan should block. This is to support incoming features that require information from semgrep.dev (complete)

- Extract mode: users can now choose to include or exclude rules to run on, similar to `paths:`. For example, to only run on the rules `example-1` and `example-2`, you would write

  ```yaml
  rules:
    - id: test-rule
      mode: extract
      rules:
        include:
          - example-1
          - example-2
  ```

  To run on everything except `example-1` and `example-2`, you would write

  ```yaml
  rules:
    - id: test-rule
      mode: extract
      rules:
        exclude:
          - example-1
          - example-2
  ```

  (gh-7858)

- Kotlin: Added literal metavariables, from patterns like `"$FOO"`. You can still match strings that only contain a single interpolated ident by using the brace notation, e.g. `"${FOO}"`. (pa-2755)

- Increase timeout of `semgrep ci` upload findings network calls and make said timeout configurable with env var `SEMGREP_UPLOAD_FINDINGS_TIMEOUT` (timeout)
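The include/exclude semantics of the extract-mode item above (gh-7858) can be sketched as a simple filter over rule IDs. This is an illustration only: the function name `select_rules` is hypothetical, matching is assumed to be on exact rule IDs, and how Semgrep treats a rule that specifies both lists at once is not stated in the changelog.

```python
from typing import Iterable, List, Optional


def select_rules(
    rule_ids: Iterable[str],
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> List[str]:
    """Return the rule IDs that an extract rule would hand its output to.

    With `include`, only the listed rules run; with `exclude`, every rule
    except the listed ones runs; with neither, all rules run.
    """
    selected = []
    for rule_id in rule_ids:
        if include is not None and rule_id not in include:
            continue
        if exclude is not None and rule_id in exclude:
            continue
        selected.append(rule_id)
    return selected


all_rules = ["example-1", "example-2", "example-3"]
only_listed = select_rules(all_rules, include=["example-1", "example-2"])
all_but_listed = select_rules(all_rules, exclude=["example-1", "example-2"])
```

In this sketch `only_listed` keeps the two named rules and `all_but_listed` keeps only `example-3`, mirroring the two YAML configurations in the entry.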
Changed
- Relaxed restrictions on symbolic propagation so that symbolic values survive branching statements. Now (with symbolic-propagation enabled) `foo(bar())` will match the following code:

  ```python
  def test():
    x = bar()
    if cond:
      exit()
    foo(x)
  ```

  Previously any symbolically propagated value was lost after any kind of branching statement. (pa-2739)
Fixed
- swift: support ellipsis metavariable (gh-7666)
- Scala: You can now put an ellipsis inside of a `catch`, to write a pattern like:

  ```
  try {
    ...
  } catch {
    ...
  }
  ```

  which will match every kind of try-catch. (gh-7807)
- When scanning with `-l dockerfile`, files named `dockerfile` as well as `Dockerfile` will be scanned. (gh-7824)
- Fix for very long runtimes that could happen due to one of our optimizations. We now detect when that might happen and skip the optimization. (gh-7839)
- Improve type inference for some simple arithmetic expressions (inference)
- Fixed bug introduced in 1.19.0 that was causing some stack overflows. (pa-2740)
Release v1.22.0
1.22.0 - 2023-05-15
Added
- Add support for language Cairo 1.0 (develop). Thanks to Frostweeds (Romain Jufer) for his contribution! (gh-7757)
- On logged in `semgrep ci` scans, report lockfile parse errors to display in webUI (lockfileparse)
- Pro: Java: Taint-mode can now do field-sensitive analysis of class constructors. For example, if the default constructor of a class `C` sets its field `x` to a tainted value, given `o = new C()`, Semgrep will know that `o.getX()` is tainted. (pa-2570)
- Kotlin: Added named ellipses, like $...X (pa-2710)
- Kotlin: Interpolated identifiers in strings, such as "$foo", are now properly
able to match explicitly interpolated expressions, like "${...}". (pa-2711)
Changed
- Cleanup: Removed Bloom filter optimization. This optimization had been turned off by default since September 2022 (release 0.116.0) without any noticeable effect. It had its role in the past when it was first introduced, but now it's time for it to go! (cleanup-1)
- engine: The use of a matching cache for statements is now disabled by default, please let us know if you notice any performance degradation. We plan to remove this optimization in a few weeks. (cleanup-2)
Fixed
- Enable automatic removal of matched codes by allowing an empty string in the fix field. (gh-6318)
- Updated SARIF to use nested levels, added confidence to tags and included references with markdown links. (gh-7317)
- taint-mode: Fixed bug in taint labels that was causing some fatal errors:
Failure "Call AST_utils.with_xxx_equal to avoid this error." (gh-7694)