Releases: semgrep/semgrep-interfaces

Release v1.65.0

11 Mar 19:03
3e7bbaf

1.65.0 - 2024-03-11

Changed

  • Removed the extract-mode rules experimental feature. (extract_mode)

Release v1.64.0

07 Mar 05:08
13fe14d

1.64.0 - 2024-03-07

Changed

  • Removed the AST caching experimental feature (--experimental --ast-caching
    in osemgrep and -parsing_cache_dir in semgrep-core). (ast_caching)
  • Removed the Registry caching experimental feature (--experimental --registry-caching)
    in osemgrep. (registry_caching)

Fixed

  • Clean any credentials from project URL before using it, to prevent leakage. (saf-876)
  • ci: Updated the logic for the informational message printed when no rules
    are sent, so that it displays correctly when secrets is enabled (in
    addition to when code is). (scrt-455)
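
To illustrate the credential-cleaning fix (saf-876), here is a minimal sketch in Python of stripping user info from a project URL before it is logged or sent anywhere. The function name strip_credentials is hypothetical and this is not Semgrep's actual implementation; it only shows the general idea using the standard library.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url: str) -> str:
    """Return url without any user:password@ prefix (hypothetical helper)."""
    parts = urlsplit(url)
    # Everything before the last "@" in the network location is user info.
    _, _, host_and_port = parts.netloc.rpartition("@")
    return urlunsplit(parts._replace(netloc=host_and_port))


cleaned = strip_credentials("https://user:s3cret@example.com/org/repo.git")
print(cleaned)
```

URLs without credentials pass through unchanged, so the helper can be applied unconditionally.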

Release v1.63.0

27 Feb 16:52
8751faa

1.63.0 - 2024-02-27

Added

  • Dataflow: Added support for nested record patterns such as { body: { param } }
    in the LHS of an assignment. Now given { body: { param } } = tainted Semgrep
    will correctly mark param as tainted. (flow-68)
  • Matching: metavariable-regex can now match on metavariables of interpolated
    strings which use variables that have known values. (saf-865)
  • Added support for parsing Swift Package Manager manifest files and lockfiles. (sc-1217)

Fixed

  • Fixed taint signatures not capturing changes to parameters' fields. (flow-70)
  • Scan summary links printed after semgrep ci scans now reflect a custom SEMGREP_APP_URL, if one is set. (saf-353)

Release v1.62.0

22 Feb 20:54
bbfd1c5

1.62.0 - 2024-02-22

Added

  • Pro: Adds support for Python constructors to taint analysis.

    If interfile naming resolves a call to a Python constructor, taint
    analysis will now track the resulting objects with fewer heuristics.
    Without interfile analysis, these changes have no effect on the behavior
    of tainting. As a result, in the following program the OSS analysis
    matches both calls to sink, while the interfile analysis matches only
    the second call to sink.

    class A:
        untainted = "not"
        tainted = "not"
        def __init__(self, x):
            self.tainted = x
    
    a = A("tainted")
    # OK:
    sink(a.untainted)
    # MATCH:
    sink(a.tainted)

    (ea-272)
    
  • Pro: taint-mode: Added basic support for "index sensitivity", that is,
    Semgrep will track taint on individual indexes of a data structure when
    these are constant values (integers or strings), and the code uses the
    built-in syntax for array indexing in the corresponding language
    (typically E[i]). For example, in the Python code below Semgrep Pro
    will not report a finding on sink(x) or sink(x[1]) because it will
    know that only x[42] is tainted:

    x[1] = safe
    x[42] = source()
    sink(x) # no more finding
    sink(x[1]) # no more finding
    sink(x[42]) # finding
    sink(x[i]) # finding

    There is still a finding for sink(x[i]) when i is not constant. (flow-7)
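
As a rough mental model of index sensitivity, the toy class below tracks taint per constant index of a single container. It is only a sketch: the class and method names are hypothetical, it covers indexed reads only, and it does not reflect how Semgrep Pro implements the analysis.

```python
class IndexSensitiveTaint:
    """Toy model: taint is tracked per constant index of one container."""

    def __init__(self):
        self.tainted_indexes = set()

    def assign(self, index, value_is_tainted):
        # Writing through a constant index only affects that index.
        if value_is_tainted:
            self.tainted_indexes.add(index)
        else:
            self.tainted_indexes.discard(index)

    def read_constant(self, index):
        # x[42]: tainted only if that exact index holds tainted data.
        return index in self.tainted_indexes

    def read_unknown(self):
        # x[i] with a non-constant i: conservatively tainted if any index is.
        return bool(self.tainted_indexes)


x = IndexSensitiveTaint()
x.assign(1, False)   # x[1] = safe
x.assign(42, True)   # x[42] = source()
```

With this state, a read of x[1] is clean, a read of x[42] is tainted, and a read through an unknown index is treated as tainted, mirroring the findings listed in the entry.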

Changed

  • taint-mode: Added exact: false sinks so that one can specify that anything
    inside a code region is a sink, e.g. if (...) { ... }. This used to be the
    semantics of sink specifications until Semgrep 1.1.0, when we made sink matching
    more precise by default. Now we allow reverting to the old semantics.

    In addition, when exact: true (the default), we simplified the heuristic used
    to support traditional sink(...)-like specs together with the option
    taint_assume_safe_functions: true. Now, if the spec formula is not a
    patterns with a focus-metavariable, we look for taint in the arguments of
    a function call. (flow-1)

  • The project name for repos scanned locally will now be local_scan/<repo_name> instead
    of simply <repo_name>. This will clarify the origin of those findings. Also, the
    "View Results" URL displayed for findings now includes the repository and branch names. (saf-856)

Fixed

  • taint-mode: experimental: Semgrep CLI taint traces do not yet support
    multiple labels, so Semgrep picks one arbitrary label to report, which is
    sometimes not the desired one. As a temporary workaround, Semgrep will
    look at the requires of the sink, and if it has the shape A and ..., it
    will pick A as the preferred label and report its trace. (flow-65)
  • Fixed trailing newline parsing in pyproject.toml and poetry.lock files. (gh-9777)
  • Fixed an issue that led to incorrect autofix application in certain cases where multiple fixes were applied to the same line. (saf-863)
  • The tokens for type parameter brackets are now stored in the generic AST,
    so that those constructs can be autofixed correctly. (tparams)
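
The label-selection workaround (flow-65) can be sketched as follows. This is an assumption-laden illustration: preferred_label is a hypothetical helper that treats requires as a plain string and only recognizes the literal shape A and ..., which may differ from how Semgrep actually inspects the formula.

```python
import re


def preferred_label(requires: str):
    """Return A when requires has the shape 'A and ...', else None (hypothetical)."""
    match = re.match(r"\s*([A-Za-z_][A-Za-z0-9_]*)\s+and\b", requires)
    return match.group(1) if match else None


label = preferred_label("TAINTED and not SANITIZED")
print(label)
```

When the expression has any other shape, the sketch returns None, which corresponds to falling back to an arbitrary label.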

Release v1.61.1

14 Feb 19:51
bbfd1c5

1.61.1 - 2024-02-14

Added

  • Added performance metrics using OpenTelemetry for better visualization.
    Users wishing to understand the performance of their Semgrep scans or
    to help optimize Semgrep can configure the backend collector created in
    libs/tracing/unix/Tracing.ml.

    This is experimental and both the implementation and flags are likely to
    change. (ea-320)

  • Created a new environment variable SEMGREP_REPO_DISPLAY_NAME for use in semgrep CI.
    Currently, this does nothing. The goal is to provide a way to override the display
    name of a repo in the Semgrep App. (gh-8953)

  • The OCaml/C executable (semgrep-core or osemgrep) is now passed through
    the strip utility, which reduces its size by 10-25% depending on the
    platform. Contribution by Filipe Pina (@fopina). (gh-9471)

Changed

  • "Missing plugin" errors (i.e., rules that cannot be run without --pro) will now
    be grouped and reported as a single warning. (ea-842)

Release v1.60.1

09 Feb 09:02
eed58a0

1.60.1 - 2024-02-09

Added

  • Rule syntax: Metavariables by the name of $_ are now anonymous, meaning that
    they do not unify within a single pattern or across patterns, and essentially
    just unconditionally specify some expression.

    For instance, the pattern foo($_, $_) may match the code foo(1, 2).

    This will change the behavior of existing rules that use the metavariable
    $_, if they rely on unification still happening. This can be fixed by simply
    giving the metavariable a real name like $A. (ea-837)

  • Added infrastructure for semgrep supply chain in semgrep-core. Not fully functional yet. (ssc-port)
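
To make the anonymous $_ behavior (ea-837) concrete, here is a toy argument matcher in Python. It is a simplified sketch with a hypothetical match_args function, not Semgrep's matching engine: named metavariables must unify across positions, while $_ matches anything and is never bound.

```python
def match_args(pattern_args, code_args):
    """Match call arguments against metavariables; return bindings or None."""
    if len(pattern_args) != len(code_args):
        return None
    bindings = {}
    for metavar, value in zip(pattern_args, code_args):
        if metavar == "$_":
            # Anonymous: matches anything and is never bound or unified.
            continue
        if metavar in bindings and bindings[metavar] != value:
            # Named metavariables must unify: same name, same value.
            return None
        bindings[metavar] = value
    return bindings


anonymous = match_args(["$_", "$_"], ["1", "2"])   # like foo($_, $_) vs foo(1, 2)
named = match_args(["$A", "$A"], ["1", "2"])       # like foo($A, $A) vs foo(1, 2)
```

In this model the anonymous pattern matches with no bindings, while the named pattern fails because $A cannot be both 1 and 2.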

Changed

  • Dataflow: Simplified the IL translation for Python with statements to let
    symbolic propagation assume that with foo() as x: ... entails x = foo(),
    so that e.g. Session().execute("...") matches:

    with Session() as s:
        s.execute("SELECT * from T")

    (CODE-6633)
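
The idea that with foo() as x: ... entails x = foo() can be illustrated with Python's own ast module. The sketch below is not Semgrep's IL translation; with_bindings is a hypothetical helper that simply lists the assignments implied by each with item (it relies on ast.unparse, available in Python 3.9 and later).

```python
import ast


def with_bindings(source: str):
    """List the 'x = foo()' assignments implied by 'with foo() as x:' items."""
    implied = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.With):
            for item in node.items:
                # Items without an 'as' target bind nothing.
                if item.optional_vars is not None:
                    target = ast.unparse(item.optional_vars)
                    value = ast.unparse(item.context_expr)
                    implied.append(f"{target} = {value}")
    return implied


code = 'with Session() as s:\n    s.execute("SELECT * from T")\n'
bindings = with_bindings(code)
```

Once s = Session() is known, a call such as s.execute("...") can be read as Session().execute("..."), which is what lets the pattern in the entry match.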
    

Fixed

  • Output: Semgrep CLI no longer sometimes interpolates metavariables twice
    when the text substituted for a metavariable itself contains a valid
    metavariable. (ea-838)
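
The double-interpolation issue (ea-838) is a general pitfall of substituting placeholders one at a time. The Python sketch below contrasts a naive sequential replacement with a single-pass substitution; both function names and the metavariable regex are assumptions made for illustration, not Semgrep's code.

```python
import re

# Simplified metavariable shape, assumed for this sketch.
METAVAR = re.compile(r"\$[A-Z_][A-Z_0-9]*")


def interpolate_sequential(message, bindings):
    """Naive variant: one str.replace per metavariable, which rescans output."""
    for name, value in bindings.items():
        message = message.replace(name, value)
    return message


def interpolate_once(message, bindings):
    """Single pass: substituted text is never scanned again."""
    return METAVAR.sub(lambda m: bindings.get(m.group(0), m.group(0)), message)


bindings = {"$X": "$Y", "$Y": "secret"}
naive = interpolate_sequential("found $X", bindings)
single = interpolate_once("found $X", bindings)
```

Here the matched code for $X happens to be the text $Y. The naive version substitutes it a second time, while the single-pass version leaves it alone.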

Release v1.60.0

08 Feb 18:28
eed58a0

1.60.0 - 2024-02-08

Added

  • Rule syntax: Metavariables by the name of $_ are now anonymous, meaning that
    they do not unify within a single pattern or across patterns, and essentially
    just unconditionally specify some expression.

    For instance, the pattern foo($_, $_) may match the code foo(1, 2).

    This will change the behavior of existing rules that use the metavariable
    $_, if they rely on unification still happening. This can be fixed by simply
    giving the metavariable a real name like $A. (ea-837)

  • Added infrastructure for semgrep supply chain in semgrep-core. Not fully functional yet. (ssc-port)

Fixed

  • Output: Semgrep CLI no longer sometimes interpolates metavariables twice
    when the text substituted for a metavariable itself contains a valid
    metavariable. (ea-838)

Release v1.59.1

02 Feb 17:07
2c2dd92

1.59.1 - 2024-02-02

Added

  • taint-mode: Pro: Semgrep can now track taint via static class fields and global
    variables, such as in the following example:

    static char* x;
    
    void foo() {
        x = "tainted";
    }
    
    void bar() {
        sink(x);
    }
    
    void main() {
        foo();
        bar();
    }

    (pa-3378)
    

Fixed

  • Pro: Make inter-file analysis more tolerant to small bugs, resorting to graceful
    degradation and continuing with the scan, rather than crashing. (pa-3387)

Release v1.59.0

30 Jan 12:47
32488fd

1.59.0 - 2024-01-30

Added

  • Swift: Now supports typed metavariables, such as ($X : ty). (pa-3370)

Changed

  • Add Elixir to Pro languages list in help information. (gh-9609)

  • Removed sg alias to avoid naming conflicts
    with the shadow-utils sg command for Linux systems. (gh-9642)

  • Prevent unnecessary computation when running scans without verbose logging enabled (gh-9661)

  • Deprecated the option taint_match_on, introduced in 1.51.0; it is being
    renamed to taint_focus_on. Note that taint_match_on was experimental, and
    taint_focus_on is experimental too. The option taint_match_on will continue
    to work, but it will be removed completely at some point after 1.63.0. (pa-3272)

  • Added information on product-related flags to help output, especially for Semgrep Secrets. (pa-3383)

  • taint-mode: Improve inference of best matches for exact-sources, exact-sanitizers,
    and sinks. Now we also avoid FPs in cases such as:

    dangerouslySetInnerHTML = {
      // ok:
      {__html: props ? DOMPurify.sanitize(props.text) : ''} // no more FPs!
    }
    

    where props is tainted and the sink specification is:

    patterns:
      - pattern: |
         dangerouslySetInnerHTML={{__html: $X}}
      - focus-metavariable: $X
    

    Previously Semgrep wrongly considered the individual subexpressions of the
    conditional as sinks, including the props in props ? ..., thus producing a
    false positive. Now it will only consider the conditional expression as a whole
    as the sink. (rules-6457)

  • Removed an internal legacy syntax for secrets rules (mode: semgrep_internal_postprocessor). (scrt-320)

Fixed

  • Autofix: Fixes that span multiple lines will now try to align
    inserted fixed lines with each other. (gh-3070)

  • Matching: Patterns with try blocks and catch clauses can now match try
    blocks that have extraneous catch clauses, as long as the pattern's catch
    clauses match a subset of them. For instance, the pattern

    try:
      ...
    catch A:
      ...
    

    can now match

    try:
      ...
    catch A:
      ...
    catch B:
      ...

    (gh-3362)
    
  • Previously, some people got the error:

    Encountered error when running rules: Other syntax error at line NO FILE INFO YET:-1:
    Invalid_argument: String.sub / Bytes.sub
    

    Semgrep should now report this error properly with a file name and line number and
    handle it gracefully. (gh-9628)

  • Fixed Dockerfile parsing bug where multiline comments were parsed incorrectly. (gh-9628-2)

  • The language server will now properly respect findings that have been ignored via the app (lsp-fingerprints)

  • taint-mode: Pro: Semgrep will now propagate taint via instance variables when
    calling methods within the same class, making this example work:

    class Test {

      private String str;

      public void setStr() {
        this.str = "tainted";
      }

      public void useStr() {
        //ruleid: test
        sink(this.str);
      }

      public void test() {
        setStr();
        useStr();
      }

    }

    (pa-3372)
  • taint-mode: Pro: Taint traces will now reflect when taint is propagated via
    class fields, such as in this example:

    class Test {

      private String str;

      public void setStr() {
        this.str = "tainted";
      }

      public void useStr() {
        //ruleid: test
        sink(this.str);
      }

      public void test() {
        setStr();
        useStr();
      }

    }

    Previously Semgrep would report that taint originated at this.str = "tainted",
    but it would not tell you how the control flow got there. Now the taint trace
    will indicate that we get there by calling setStr() inside test(). (pa-3373)

  • Addressed an issue related to matching top-level identifiers with meta-variable
    qualified patterns in C++, such as matching ::foo with ::$A::$B. This problem
    was specific to Pro Engine-enabled scans. (pa-3375)
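
The try/catch matching change in this list (gh-3362) amounts to a subset check on catch clauses. The sketch below shows one way to express it in Python; catches_match is hypothetical, clauses are reduced to their exception names, and the requirement that clauses keep their relative order is an assumption of this sketch rather than something the entry states.

```python
def catches_match(pattern_catches, code_catches):
    """True if every pattern catch clause appears in the code, in order."""
    remaining = iter(code_catches)
    # 'clause in remaining' consumes the iterator up to the match,
    # so later pattern clauses can only match later code clauses.
    return all(clause in remaining for clause in pattern_catches)


extra_clause = catches_match(["A"], ["A", "B"])        # pattern A vs code A, B
missing_clause = catches_match(["A", "C"], ["A", "B"])  # C is not in the code
```

The first call corresponds to the example in the entry: a pattern with a single catch A clause matches code that also has a catch B clause.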

Release v1.58.0

23 Jan 01:50
4cc11b0

1.58.0 - 2024-01-23

Added

  • Added a severity icon (e.g. "❯❯❱") and corresponding color to our CLI text output
    for findings of known severity. (grow-97)

  • Naming has better support for if statements. In particular, for
    languages with block scope, shadowed variables inside if-else blocks
    that are tainted won't "leak" outside of those blocks.

    This helps with features related to naming, such as tainting.

    For example, previously in Go, Semgrep would report that the x in
    sink(x) is tainted, even though the x that is tainted is the
    one inside the scope of the if block.

    func f() {
      x := "safe";
      if (c) {
        x := "tainted";
      }
      // x should not be tainted
      sink(x);
    }

    This is now fixed. (pa-3185)

  • OSemgrep can now scan remote git repositories. Pass --experimental --pro --remote http[s]://<website>/.../<repo>.git to use this feature (pa-remote)

Changed

  • Rules stored under a "hidden" directory (e.g., dir/.hidden/myrule.yml)
    are now processed when using --config .
    We used to skip dot files under dir, with inconsistent exceptions: we kept
    rules/.semgrep.yml but not path/.github/foo.yml, and kept
    src/.semgrep/bad_pattern.yml but not ./.pre-commit-config.yaml. This was
    mainly because we used to fetch rules from ~/.semgrep/ implicitly when
    --config was not given. That feature was removed, so now we can keep it
    simple. (hidden_rules)
  • Removed support for writing rules using jsonnet. This feature
    will be restored once we finish the port to OCaml of the semgrep CLI. (jsonnet)
  • The primitive object construct expression will no longer match the new
    expression pattern. For example, the pattern new $TYPE will now only match
    new int, not int(). (pa-3336)
  • The placement new expression will no longer match the new expression without
    placement. For instance, the pattern new ($STORAGE) $TYPE will now only match
    new (storage) int and not new int. (pa-3338)

Fixed

  • Java: You can now use metavariable ellipses properly in
    function arguments, as statements, and as expressions.

    For instance, you may write the pattern

    public $F($...ARGS) { ... }

    (gh-9260)
    
  • Nosemgrep: Fixed a bug where Semgrep would err upon reading a nosemgrep
    comment with multiple rule IDs. (gh-9463)

  • Fixed bugs in gitignore/semgrepignore globbing implementation affecting --experimental. (gh-9544)

  • Fixed rule IDs, descriptions, findings, and autofix text not wrapping as expected.
    Findings that share a file but come from different rules are now separated
    by a newline instead of a horizontal separator, per the design spec. (grow-97)

  • Keep track of the origin of return; statements in the dataflow IL so that
    recently added (Pro-only) at-exit: true sinks work properly on them. (pa-3337)

  • C++: Improve translation of delete expressions to the dataflow IL so that
    recently added (Pro-only) at-exit: true sinks work on them. Previously
    delete expressions at "exit" positions were not being properly recognized
    as such. (pa-3339)

  • cli: Fixed a Python runtime error when printing wrapped text with a width of 0. (pa-3366)

  • Fixed a bug where Gemfile.lock files with multiple GEM sections
    would not be parsed correctly. (sc-1230)
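
For the nosemgrep fix in this list (gh-9463), the sketch below shows what parsing a comment with several rule IDs could look like. The regex and the ignored_rule_ids function are assumptions for illustration only; they follow the comma-separated form "nosemgrep: rule-a, rule-b" and are not taken from Semgrep's source.

```python
import re

# 'nosemgrep' optionally followed by ': id' and further ', id' entries.
NOSEMGREP = re.compile(
    r"nosemgrep(?::\s*(?P<ids>[^\s,]+(?:\s*,\s*[^\s,]+)*))?"
)


def ignored_rule_ids(line):
    """None if no nosemgrep comment, [] if it covers all rules, else the IDs."""
    match = NOSEMGREP.search(line)
    if match is None:
        return None
    ids = match.group("ids")
    if ids is None:
        return []
    return [rule_id.strip() for rule_id in ids.split(",")]


ids = ignored_rule_ids("x = eval(y)  # nosemgrep: rule-a, rule-b")
```

A bare nosemgrep comment yields an empty list here, meaning no specific rule was named.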