bug(java): Maven 429 rate limit is reported as semaphore acquire: context canceled instead of the real error #10792

@DmitriyLewen

Description

In #10691 (fixed by #10693) we changed the behavior on a remote Maven 429 Too Many Requests: instead of only logging it at DEBUG and silently skipping the dependency, Trivy now stops the scan with a concrete, actionable error.

However, in some cases (large multi-module Maven projects) the user never sees that 429 error and gets a generic context canceled instead:

FATAL  Fatal error  run error: fs scan error: scan error: scan failed: failed analysis:
analyze with traversal: walk dir error: unknown error with docs/maven-plugin/pom.xml:
failed to analyze file: analyze file (docs/maven-plugin/pom.xml): semaphore acquire:
context canceled

The actual cause, the remote 429 Too Many Requests, is only visible with --debug:

[pom] Failed to fetch url="https://repo.maven.apache.org/maven2/org/codehaus/groovy/groovy-all/3.0.25/groovy-all-3.0.25.pom" err="Get: context canceled"

So for multi-module projects the improvement from #10693 is effectively lost: the user gets context canceled instead of the actionable rate-limit message.

Reproduction

git clone https://github.com/keycloak/keycloak
cd keycloak
trivy fs --scanners vuln .

Reproduces when Maven Central rate-limits the IP (the repo has many pom.xml files, so the walk is still running when the 429 fires). The failing file in the error varies between runs (e.g. docs/maven-plugin/pom.xml), since it depends on timing.

Expected behavior

The scan stops with the actionable rate-limit error introduced in #10693:

FATAL  Error  remote Maven repository returned 429 Too Many Requests for https://repo.maven.apache.org/maven2/.../*.pom. Retry-After: <n>.
The repository blocks all subsequent requests from this IP until the block clears.
To avoid this, populate the local Maven cache before scanning (e.g. run `mvn dependency:resolve` and cache ~/.m2 in CI).

Actual behavior

The real *types.UserError (429) is masked by a generic semaphore acquire: context canceled.

Root cause

Artifact analysis runs file analyzers through errgroup.WithContext, while the synchronous file walk uses the group's egCtx for limit.Acquire. When the pom analyzer goroutine returns the 429 *types.UserError, errgroup cancels egCtx; the still-running walk then fails its next Acquire with context.Canceled, and that walk error is returned before eg.Wait() — which is where the real error lives.

The masking has been latent since the sync.WaitGroup → errgroup.WithContext migration (#9538) and became observable once #10693 made the pom analyzer return a fatal *types.UserError. The same pattern exists in the local (fs), image and vm artifacts. A single-pom.xml project surfaces the 429 correctly because the walk finishes before the cancellation.

Version

0.71.0 

Metadata

Labels

kind/bug: Categorizes issue or PR as related to a bug.
scan/vulnerability: Issues relating to vulnerability scanning
target/filesystem: Issues relating to filesystem scanning
target/repository: Issues relating to VCS repository scanning
