Empty scan results from failed runs are stored and reused even if the failure cause gets fixed and a new scan is triggered #10054

Open
@alpianon


Describe the bug

When ScanCode completely fails to scan a package, for whatever reason, no meaningful log message is shown and processing continues until ORT finds that the temporary result.json output file from ScanCode is missing and throws a FileNotFoundException, for example:

11:40:01.114 [main] ERROR org.ossreviewtoolkit.scanner.PathScannerWrapper - Failed to scan
RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/flozz/StackBlur.git, 
revision=1b85fe57ae5c2e7beeff157e1f9f7c7a7082f537, path=), resolvedRevision=1b85fe57ae5c2e7beeff157e1f9f7c7a7082f537)
with path scanner 'ScanCode': FileNotFoundException: /tmp/ort-ScanCode16441962167167250959/result.json (No such file or 
directory)

Worse, the result containing the generic error is saved in ~/.ort/scanner/artifact/<name>/scan-results.yml and reused, so the package is never scanned again even after the ScanCode problem is fixed. The same error keeps appearing in subsequent runs of ort scan until all the (wrong) cached results are deleted.

To Reproduce

Steps to reproduce the behavior:

  1. Force ScanCode to fail, e.g. by adding an invalid ScanCode configuration entry to config.yml, such as:
ort:
    config:
      ScanCode:
        options:
          commandLine: '--copyright --license --info --strip-root --timeout 600'

(Since ORT 52.0.0, the commandLine field should contain parameters separated by commas, not spaces; the correct value would be '--copyright,--license,--info,--strip-root,--timeout,600'.)

  2. Launch ort scan with the relevant options for your case.
  3. Check the output log and the result file and see the generic error above.
  4. Fix the ScanCode issue (in the example above, correct the commandLine value by replacing spaces with commas).
  5. Re-run ort scan with the same parameters, check the result file, and see that the errors are still there.
  6. Delete all cached results that contain the generic error:
# Quote the glob so the shell does not expand it; each matching file is deleted directly.
find ~/.ort/scanner -name '*.yml' -exec grep -q FileNotFoundException {} \; -delete
  7. Re-run ort scan with the same parameters; check the result file and the errors should be gone.
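
Before running the deletion step, it may help to review which cached files would be affected. The following is a minimal sketch, assuming the cache lives under ~/.ort/scanner as described above and that failed results contain the string FileNotFoundException. The function name list_failed_scan_results is hypothetical and not part of ORT.

```shell
# Hypothetical helper (not part of ORT): print cached scan result files
# that mention a given error string. Nothing is deleted.
list_failed_scan_results() {
    cache_dir=${1:-"$HOME/.ort/scanner"}
    pattern=${2:-FileNotFoundException}
    # Quote the glob so the shell does not expand it before find sees it.
    find "$cache_dir" -type f -name '*.yml' -exec grep -l -- "$pattern" {} +
}
```

Calling list_failed_scan_results with no arguments prints the paths of the affected files under the default cache directory, which can then be inspected or removed by hand.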

Expected behavior

ORT should detect when ScanCode completely fails to run and report the error message it gets from ScanCode, instead of ignoring the failure and later throwing a generic FileNotFoundException when it cannot find the report. Otherwise, when ScanCode fails for whatever reason, the error cannot be debugged systematically and one can only guess at the cause.

Moreover, in case ScanCode completely fails to run, no result should be cached.
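
To make the expected behavior concrete, here is a rough sketch of the idea as a shell wrapper. It is only an illustration, not how ORT is implemented: the function name scan_and_cache is hypothetical, and it assumes the wrapped command writes its report to standard output and signals failure with a non-zero exit code.

```shell
# Hypothetical wrapper illustrating the expected behavior; this is not ORT's code.
# Usage: scan_and_cache <cache-file> <command> [args...]
scan_and_cache() {
    cache_file=$1
    shift
    tmp_out=$(mktemp) || return 1
    tmp_err=$(mktemp) || { rm -f "$tmp_out"; return 1; }

    # Run the command, keeping its report and its error output separate.
    "$@" >"$tmp_out" 2>"$tmp_err"
    status=$?

    if [ "$status" -eq 0 ]; then
        # Success: only now is the result written to the cache location.
        mv "$tmp_out" "$cache_file"
    else
        # Failure: surface the tool's own message and leave the cache untouched.
        echo "scan command failed with exit code $status:" >&2
        cat "$tmp_err" >&2
        rm -f "$tmp_out"
    fi
    rm -f "$tmp_err"
    return "$status"
}
```

The key design point is that the cache file is created only after the exit code has been checked, so a failed run leaves nothing behind to be reused.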

Console / log output

see above

Environment

  • ORT version: 54.0.0
  • Java version: 21
  • OS: Linux

The bug likely affects earlier versions of ORT as well.

Labels: scanner
