Lombok-generated method incorrectly indexed #788

@stevev-neosec

Description

Using Docker image retrieved with docker pull --platform=linux/arm64 sourcegraph/scip-java:0.10.4.

Indexing project - https://github.com/hendisantika/spring-boot-swagger
project hash: bf21451 and possibly others

Generated index (text format): p1.java.scip.txt.json

In the Student class, all getters and setters are generated by Lombok via @Data. For all of the getters, the SCIP index defines a multiline occurrence with `symbol_roles: 1`, EXCEPT for getAddress().

Example:

  occurrences {
    range: 26
    range: 24
    range: 27
    range: 5
    symbol: "semanticdb maven maven/com.hendisantika.springboot.swagger/springboot-swagger 0.0.1-SNAPSHOT com/hendisantika/springboot/swagger/model/Student#getAge()."
    symbol_roles: 1
  }
  ...
  symbols {
    symbol: "semanticdb maven maven/com.hendisantika.springboot.swagger/springboot-swagger 0.0.1-SNAPSHOT com/hendisantika/springboot/swagger/model/Student#getAge()."
    kind: Method
    display_name: "getAge"
    signature_documentation {
      relative_path: "src/main/java/com/hendisantika/springboot/swagger/model/Student.java"
      language: "java"
      text: "@SuppressWarnings(\"all\")\n@Generated\npublic Integer getAge()"
    }
  }
  ...
  occurrences {
    range: 34
    range: 25
    range: 35
    symbol: "semanticdb maven maven/com.hendisantika.springboot.swagger/springboot-swagger 0.0.1-SNAPSHOT com/hendisantika/springboot/swagger/model/Student#getAddress()."
    symbol_roles: 1
  }
  ...
  symbols {
    symbol: "semanticdb maven maven/com.hendisantika.springboot.swagger/springboot-swagger 0.0.1-SNAPSHOT com/hendisantika/springboot/swagger/model/Student#getAddress()."
    kind: Method
    display_name: "getAddress"
    signature_documentation {
      relative_path: "src/main/java/com/hendisantika/springboot/swagger/model/Student.java"
      language: "java"
      text: "@SuppressWarnings(\"all\")\n@Generated\npublic Address getAddress()"
    }
  }

Each of the two problematic endpoints happens to call the getAddress() method in a stream filter:

.filter(e -> e.getValue().getAddress().getCity().equals(cityName))

If you look at the example SCIP occurrences above, you'll see that the range for getAddress() exactly matches the location of getAddress() in the original source within the line this.address = p.getAddress();. The SCIP index is incorrect because this is not the line on which the method is defined; it is a line on which it is referenced.
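To make the mismatch concrete, here is a minimal sketch that slices a source line at a single-line SCIP range. It assumes the three-element form is [line, startCharacter, endCharacter] with zero-based, end-exclusive columns, and it assumes the constructor line is indented by eight spaces (the source file is not reproduced in this report). RangeSlice and slice are hypothetical names, not part of scip-java.

```java
// Hypothetical helper for illustration only; not part of scip-java.
final class RangeSlice {
    /** Returns the text covered by a single-line range [line, startChar, endChar]. */
    static String slice(String sourceLine, int[] range) {
        if (range.length != 3) {
            throw new IllegalArgumentException("expected a single-line range with 3 elements");
        }
        // Columns are treated as zero-based and end-exclusive.
        return sourceLine.substring(range[1], range[2]);
    }

    public static void main(String[] args) {
        // Assumption: the assignment is indented by eight spaces in Student.java.
        String line = "        this.address = p.getAddress();";
        // Range of the defining occurrence reported for Student#getAddress().
        System.out.println(slice(line, new int[] {34, 25, 35})); // prints "getAddress"
    }
}
```

Under those assumptions, the "definition" range covers exactly the method name at the call site, not a method declaration.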

Note that this is not a problem for getAge(). If you look at its defining occurrence above, you'll see that it has a multiline range.
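For reference, a small sketch of how the two range shapes can be told apart. In SCIP, a range with three elements is [startLine, startCharacter, endCharacter] and the end line is the same as the start line; a range with four elements is [startLine, startCharacter, endLine, endCharacter]. ScipRange is a hypothetical class name used only for this illustration.

```java
// Hypothetical value class for illustration only; not part of scip-java.
final class ScipRange {
    final int startLine;
    final int startChar;
    final int endLine;
    final int endChar;

    private ScipRange(int startLine, int startChar, int endLine, int endChar) {
        this.startLine = startLine;
        this.startChar = startChar;
        this.endLine = endLine;
        this.endChar = endChar;
    }

    /** Decodes the repeated `range` field of a SCIP occurrence. */
    static ScipRange decode(int... r) {
        if (r.length == 3) {
            // Single-line form: the end line equals the start line.
            return new ScipRange(r[0], r[1], r[0], r[2]);
        }
        if (r.length == 4) {
            return new ScipRange(r[0], r[1], r[2], r[3]);
        }
        throw new IllegalArgumentException("range must have 3 or 4 elements");
    }

    boolean isMultiline() {
        return endLine != startLine;
    }

    public static void main(String[] args) {
        System.out.println(ScipRange.decode(26, 24, 27, 5).isMultiline()); // getAge(): prints "true"
        System.out.println(ScipRange.decode(34, 25, 35).isMultiline());    // getAddress(): prints "false"
    }
}
```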

One additional use case was tested by changing the above .filter example to:

.filter(e -> e.getValue().getFirstName().equals(cityName))

The SCIP index was re-generated and the resulting indexing did not have any issues.

For some reason this was specific to the getAddress() method.

Activity

olafurpg (Contributor) commented on May 2, 2025

Thank you for reporting! I just indexed this repo with 0.10.4 and ran scip snapshot. The annotation processors generate a large number of symbols, and a lot of the resulting output is problematic. https://gist.github.com/olafurpg/98a3f9f687b68934e2562c6d9e6f40c1

The getAddress occurrence has both a reference to a jfairy symbol and a definition of a springboot-swagger symbol:

         this.address = p.getAddress();
//             ^^^^^^^ reference semanticdb maven maven/com.hendisantika.springboot.swagger/springboot-swagger 0.0.1-SNAPSHOT com/hendisantika/springboot/swagger/model/Student#address.
//                       ^ reference local 1
//                         ^^^^^^^^^^ reference semanticdb maven maven/org.jfairy/jfairy 0.3.0 org/jfairy/producer/person/Person#getAddress().
//                         ^^^^^^^^^^ definition semanticdb maven maven/com.hendisantika.springboot.swagger/springboot-swagger 0.0.1-SNAPSHOT com/hendisantika/springboot/swagger/model/Student#getAddress().

This repo has test cases for Lombok. You can try to minimize this example to see if we can reproduce it in the test suite. That would be the best way to get this fixed.

stevev-neosec (Author) commented on May 2, 2025

Hi Ólafur,

Thank you for the quick response to this issue.

I'm not sure how to best "minimize this example" but I've done what I can.

I'm attaching a streamlined version of the project as "1.zip".

1.zip

What's provided in the zip archive will allow A-B testing, to demonstrate the issue, then to demonstrate the change that removes the issue. Ultimately it appears the SCIP indexer is confusing the Person.getAddress() symbol (from JFairy) and the Student.getAddress() symbol, though I don't know why that didn't occur with any of the other like-named symbols between Person and Student in the original project (and which I have removed in the "1.zip" version).

To do the A-B testing yourself, index the project as-is. That will recreate the issue, and you should get a SCIP file that matches "issue.java.scip" in the project root (also see "issue.java.scip.json" that was generated with protoc).

Next, rename address in Student to addr. Ensure corresponding updates in the Student constructor (the one assignment should now be this.addr = p.getAddress();) and also in StudentService.filterByCity(...) where the lambda should now read e -> e.getValue().getAddr().getCity()...... Then re-index and you should get a SCIP file that matches "noissue.java.scip" in the project root (also see "noissue.java.scip.json" that was generated with protoc).

In "issue.java.scip.json" you'll see the following occurrences where both the Person#getAddress(). occurrence and defining occurrence (symbol_roles: 1) for Student#getAddress(). have the same range:

  occurrences {
    range: 11
    range: 25
    range: 35
    symbol: "semanticdb maven maven/org.jfairy/jfairy 0.3.0 org/jfairy/producer/person/Person#getAddress()."
  }
  occurrences {
    range: 11
    range: 25
    range: 35
    symbol: "semanticdb maven maven/com.hendisantika.springboot.swagger/springboot-swagger 0.0.1-SNAPSHOT com/hendisantika/springboot/swagger/model/Student#getAddress()."
    symbol_roles: 1
  }

However, after renaming Student.address to Student.addr and re-indexing, as indicated above, you'll see something like what's in "noissue.java.scip.json". Note that the range for Student#getAddr(). no longer matches that of Person#getAddress().:

  occurrences {
    range: 11
    range: 22
    range: 32
    symbol: "semanticdb maven maven/org.jfairy/jfairy 0.3.0 org/jfairy/producer/person/Person#getAddress()."
  }
  occurrences {
    range: 8
    range: 25
    range: 10
    range: 5
    symbol: "semanticdb maven maven/com.hendisantika.springboot.swagger/springboot-swagger 0.0.1-SNAPSHOT com/hendisantika/springboot/swagger/model/Student#getAddr()."
    symbol_roles: 1
  }
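A check like the following could serve as a starting point for an automated test. It is only a sketch: it groups occurrences by their exact range and flags any range where a defining occurrence (the Definition bit, 0x1, set in symbol_roles) coincides with an occurrence of a different symbol. The class names are hypothetical, and the symbol strings are shortened here for readability.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a SCIP occurrence; not the generated protobuf class.
final class Occurrence {
    static final int DEFINITION = 0x1; // Definition bit of symbol_roles
    final int[] range;
    final String symbol;
    final int symbolRoles;

    Occurrence(int[] range, String symbol, int symbolRoles) {
        this.range = range;
        this.symbol = symbol;
        this.symbolRoles = symbolRoles;
    }

    boolean isDefinition() {
        return (symbolRoles & DEFINITION) != 0;
    }
}

final class RangeCollisions {
    /** Returns each range at which a definition shares its exact range with another symbol. */
    static List<String> find(List<Occurrence> occurrences) {
        Map<String, List<Occurrence>> byRange = new LinkedHashMap<>();
        for (Occurrence o : occurrences) {
            byRange.computeIfAbsent(Arrays.toString(o.range), k -> new ArrayList<>()).add(o);
        }
        List<String> suspicious = new ArrayList<>();
        for (Map.Entry<String, List<Occurrence>> entry : byRange.entrySet()) {
            if (hasClash(entry.getValue())) {
                suspicious.add(entry.getKey());
            }
        }
        return suspicious;
    }

    private static boolean hasClash(List<Occurrence> sameRange) {
        for (Occurrence def : sameRange) {
            if (!def.isDefinition()) {
                continue;
            }
            for (Occurrence other : sameRange) {
                if (!other.symbol.equals(def.symbol)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Symbols shortened; ranges copied from the two index excerpts.
        List<Occurrence> issue = Arrays.asList(
            new Occurrence(new int[] {11, 25, 35}, "Person#getAddress().", 0),
            new Occurrence(new int[] {11, 25, 35}, "Student#getAddress().", 1));
        List<Occurrence> noIssue = Arrays.asList(
            new Occurrence(new int[] {11, 22, 32}, "Person#getAddress().", 0),
            new Occurrence(new int[] {8, 25, 10, 5}, "Student#getAddr().", 1));
        System.out.println(find(issue));   // prints "[[11, 25, 35]]"
        System.out.println(find(noIssue)); // prints "[]"
    }
}
```

A collision of this kind is not proof of a bug on its own, but for these two excerpts it separates the "issue" index from the "noissue" index.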

I hope this helps you to narrow down the root cause, develop a fix, and create an automated test.

Thank you,
Steve

          Lombok-generated method incorrectly indexed · Issue #788 · sourcegraph/scip-java