feat(mvn): add Maven (Java) filter module — test, compile, checkstyle:check, dependency:tree#1089

Open
mariuszs wants to merge 32 commits into rtk-ai:develop from mariuszs:feat/mvn-rust-module

Conversation


@mariuszs mariuszs commented Apr 8, 2026

Summary

Maven/Java ecosystem support for RTK.

Subcommands

| Command | Savings |
|---|---|
| rtk mvn test | 99% |
| rtk mvn verify | 95% |
| rtk mvn compile | 85% |
| rtk mvn checkstyle:check | 90% |
| rtk mvn dependency:tree | 70% |
| rtk mvn clean | 95% |

Plus rtk mvnd — same filters for Maven Daemon. Unrecognized goals pass through with metrics.

For test/verify, RTK reads target/surefire-reports/ and target/failsafe-reports/
XML to surface structured failure details (class, method, exception, message) with
compact stack traces — Caused-by chains preserved, framework frames
(org.junit.*, java.base/*, org.apache.maven.*) collapsed to keep app frames
visible. Auto-detects mvnw wrapper.
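
To illustrate the frame collapsing described above, here is a minimal sketch, not the PR's actual code. The prefix list comes from the summary; the function names `is_framework_frame`, `compact_stack`, and `flush`, and the placeholder line format, are assumptions made for the example.

```rust
/// Prefixes treated as framework frames (taken from the PR summary).
const FRAMEWORK_PREFIXES: [&str; 3] = ["org.junit.", "java.base/", "org.apache.maven."];

/// True for a stack line such as `at org.junit.Assert.fail(Assert.java:89)`
/// whose frame starts with one of the framework prefixes.
fn is_framework_frame(line: &str) -> bool {
    match line.trim_start().strip_prefix("at ") {
        Some(frame) => FRAMEWORK_PREFIXES.iter().any(|p| frame.starts_with(*p)),
        None => false,
    }
}

/// Writes a placeholder for a pending run of collapsed frames, if any.
/// The placeholder text is hypothetical.
fn flush(out: &mut String, collapsed: &mut usize) {
    if *collapsed > 0 {
        out.push_str(&format!("    ... {} framework frames\n", *collapsed));
        *collapsed = 0;
    }
}

/// Collapses each run of framework frames into one placeholder line.
/// Exception headers, `Caused by:` lines, and app frames pass through.
fn compact_stack(trace: &str) -> String {
    let mut out = String::with_capacity(trace.len());
    let mut collapsed = 0usize;
    for line in trace.lines() {
        if is_framework_frame(line) {
            collapsed += 1;
            continue;
        }
        flush(&mut out, &mut collapsed);
        out.push_str(line);
        out.push('\n');
    }
    flush(&mut out, &mut collapsed);
    out
}
```

Because `Caused by:` lines never start with `at `, they are always kept, which is what preserves the cause chain in this sketch.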

Test plan

  • cargo fmt --all && cargo clippy --all-targets && cargo test --all pass
  • Snapshot tests (insta) for every filter
  • Token savings verified on real Maven fixtures
  • Manual test on real Maven project


CLAassistant commented Apr 8, 2026

CLA assistant check
All committers have signed the CLA.

@mariuszs mariuszs changed the base branch from master to develop April 8, 2026 20:11
@mariuszs mariuszs force-pushed the feat/mvn-rust-module branch 9 times, most recently from 5d7c9a2 to 8baf6bd Compare April 14, 2026 12:46
@mariuszs mariuszs changed the title feat(mvn): add Maven filter module — test, build, dependency:tree feat(mvn): add Maven (Java) filter module — test, build, dependency:tree Apr 14, 2026
@mariuszs mariuszs force-pushed the feat/mvn-rust-module branch from 6edc416 to b44c8f7 Compare April 14, 2026 13:10
@mariuszs mariuszs marked this pull request as draft April 14, 2026 15:23
@mariuszs mariuszs changed the title feat(mvn): add Maven (Java) filter module — test, build, dependency:tree feat(mvn): add Maven (Java) filter module — test, compile, checkstyle:check, dependency:tree Apr 14, 2026
@mariuszs mariuszs force-pushed the feat/mvn-rust-module branch from b604ddb to 80da844 Compare April 14, 2026 19:16
@mariuszs mariuszs marked this pull request as ready for review April 14, 2026 19:26
@mariuszs mariuszs marked this pull request as draft April 15, 2026 20:43
@mariuszs mariuszs force-pushed the feat/mvn-rust-module branch from 80da844 to 91a6534 Compare April 16, 2026 06:55
@mariuszs mariuszs changed the base branch from develop to master April 16, 2026 06:55
@mariuszs mariuszs marked this pull request as ready for review April 16, 2026 07:32
@mariuszs mariuszs force-pushed the feat/mvn-rust-module branch 3 times, most recently from 10fd699 to 98907f1 Compare April 16, 2026 12:37

Ckram commented Apr 17, 2026

@mariuszs could you add support for mvnd (maven daemon) ?

@mariuszs mariuszs force-pushed the feat/mvn-rust-module branch from 98907f1 to 781ec7b Compare April 17, 2026 14:15
mariuszs added a commit to mariuszs/rtk-java that referenced this pull request Apr 17, 2026
Adds a parallel 'rtk mvnd' subcommand that reuses the mvn filters (test,
compile, checkstyle:check, dependency:tree, passthrough) but invokes the
mvnd binary directly (bypassing mvnw, which the daemon does not use).

- New MvnBinary enum (Mvn auto-detects mvnw; Mvnd always uses mvnd)
- Threaded through run_test/run_compile/run_checkstyle/run_dep_tree/run_other
- Tracking labels keep 'mvn' and 'mvnd' separate in rtk gain
- Discover rule added for mvnd → rtk mvnd rewrites

Addresses review comment on PR rtk-ai#1089.
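
A rough sketch of how such an enum could resolve the program to launch. Only the `MvnBinary` name and its two variants come from the commit message; the `program` and `label` methods, and the Unix-only `./mvnw` check, are assumptions for illustration.

```rust
use std::path::Path;

/// Which Maven front end to launch. Variant names follow the commit
/// message; the methods below are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MvnBinary {
    Mvn,
    Mvnd,
}

impl MvnBinary {
    /// Picks the program to execute for a build rooted at `project_dir`.
    fn program(self, project_dir: &Path) -> String {
        match self {
            // Mvn prefers the project wrapper when one exists.
            MvnBinary::Mvn if project_dir.join("mvnw").is_file() => "./mvnw".to_string(),
            MvnBinary::Mvn => "mvn".to_string(),
            // Mvnd never goes through mvnw.
            MvnBinary::Mvnd => "mvnd".to_string(),
        }
    }

    /// Label that keeps the two front ends apart in tracking output.
    fn label(self) -> &'static str {
        match self {
            MvnBinary::Mvn => "mvn",
            MvnBinary::Mvnd => "mvnd",
        }
    }
}
```

Passing the enum value through each runner, rather than a string, lets the compiler check that every call site handles both front ends.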
@mariuszs mariuszs force-pushed the feat/mvn-rust-module branch from 864932d to 3d993bf Compare April 17, 2026 18:55
mariuszs added 28 commits May 29, 2026 14:36
Addresses review items from sibling PR rtk-ai#368 that apply to this PR:

- P1 #4 (code duplication): run_checkstyle, run_dep_tree, and the
  compile-like runner shared ~15 lines of near-identical cmd construction +
  runner::run_filtered plumbing. Extract run_simple_goal(binary, goal,
  tee_slug, filter, args, verbose) as the shared shell. Drops the
  compile_like_labels indirection (slug lookup now inline in
  run_compile_like). run_tests_like stays separate because XML enrichment
  needs cwd + app_pkgs + closure capture.

- Minor #subcmd_savings: mvn and mvnd discover rules had empty
  subcmd_savings. Populate with measured per-goal ratios (test 99%, verify
  95%, checkstyle 90%, dependency:tree 70%, compile 85%) so `rtk discover`
  reports accurate opportunity per goal instead of the rule-level 90%.
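
The per-goal lookup with a rule-level fallback could look roughly like this. The percentages are the ones quoted in the commit message; the table shape, the `checkstyle:check` key, `MVN_RULE_SAVINGS`, and `estimated_savings` are assumptions for the sketch (the `MVN_SUBCMD_SAVINGS` name itself appears in a later commit in this PR).

```rust
/// Rule-level estimate used when a goal has no measured ratio (percent).
const MVN_RULE_SAVINGS: f64 = 90.0;

/// Per-goal ratios quoted in this PR (percent). The slice-of-pairs
/// layout is an assumption, not the PR's actual data structure.
const MVN_SUBCMD_SAVINGS: &[(&str, f64)] = &[
    ("test", 99.0),
    ("verify", 95.0),
    ("checkstyle:check", 90.0),
    ("dependency:tree", 70.0),
    ("compile", 85.0),
];

/// Returns the measured ratio for `goal`, or the rule-level default.
fn estimated_savings(goal: &str) -> f64 {
    MVN_SUBCMD_SAVINGS
        .iter()
        .find(|(name, _)| *name == goal)
        .map(|(_, pct)| *pct)
        .unwrap_or(MVN_RULE_SAVINGS)
}
```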
`rtk mvn clean` previously fell through to Passthrough, producing ~33
lines of ANSI-colored startup noise (prefix loading, model warnings,
plugin separators) for a goal whose useful output is a single
`Deleting <path>` line plus BUILD SUCCESS.

Filter now collapses to one line:
  mvn clean: deleted /path/to/target (1.4 s)

Multi-module reactor reports `deleted N targets`; zero-target case says
`nothing to clean`. When clean is combined with a failing goal (e.g.
`mvn clean compile`), `[ERROR]` lines are preserved so the compile
failure reason stays visible. Savings measured at ≥90% on the real
fixture.

- New MvnCommands::Clean variant; dispatch_mvn routes to run_clean via
  the existing run_simple_goal helper.
- Per-goal savings (95%) added to MVN_SUBCMD_SAVINGS for rtk discover.
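
A simplified sketch of the collapse described above. It assumes ANSI codes were already stripped and that Maven prints `[INFO] Deleting <path>` and `[INFO] Total time:  <t>` lines; `filter_clean` is a hypothetical name, and the real filter in this PR may differ.

```rust
/// Sketch of a `mvn clean` filter over ANSI-free output.
fn filter_clean(raw: &str) -> String {
    let mut deleted: Vec<&str> = Vec::new();
    let mut errors: Vec<&str> = Vec::new();
    let mut total_time: Option<&str> = None;

    for line in raw.lines() {
        if let Some(path) = line.strip_prefix("[INFO] Deleting ") {
            deleted.push(path.trim());
        } else if let Some(rest) = line.strip_prefix("[INFO] Total time:") {
            total_time = Some(rest.trim());
        } else if line.starts_with("[ERROR]") {
            // Keep errors so a failing `mvn clean compile` stays explainable.
            errors.push(line);
        }
    }

    let mut out = match deleted.as_slice() {
        [] => "mvn clean: nothing to clean".to_string(),
        [one] => format!("mvn clean: deleted {one}"),
        many => format!("mvn clean: deleted {} targets", many.len()),
    };
    if let Some(t) = total_time {
        out.push_str(&format!(" ({t})"));
    }
    for e in errors {
        out.push('\n');
        out.push_str(e);
    }
    out
}
```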
Multi-module compile output was dominated by the `Reactor Build Order:` block
and per-module `Reactor Summary` status lines (~40 redundant tokens on a
green 5-module build), and javac `[ERROR] <path>:[line,col]` locations were
emitted twice — inline during compilation and again in the trailing help
block — nearly doubling the failure-path output.

Changes in filter_mvn_compile:
- Skip `Reactor Build Order:` block entirely (redundant with per-module
  Building headers that are already stripped).
- Collapse Reactor Summary: emit a compact `Reactor: N modules — M SUCCESS,
  K FAILURE (name, ...)` line only when something failed; all-green reactors
  rely on the surviving `BUILD SUCCESS` line.
- Dedup `[ERROR] <path>:[line,col]` occurrences by (path, line, col). When
  a duplicate fires, swallow the indented `[ERROR] symbol:/location:/
  required:/found:/reason:` context lines that mirror an earlier javac
  explanation emitted without the `[ERROR]` prefix.

Also: broaden `deprecated` check to `deprecat` (catches both `deprecated`
and `deprecation` variants); strip `/pom.xml` path references (previously
only literal `from pom.xml` matched); drop the Help-footer line `For more
information about the errors...` which wasn't in the boilerplate list.

Savings on fixtures adapted from rtk-ai#782 (attribution in filenames):

  mvn_pr782_compile_success_raw  74.6% → 97.8%  (+23pp)
  mvn_pr782_compile_fail_raw     49.1% → 79.7%  (+31pp)
  mvn_pr782_test_pass_raw        98.7% → 98.7%  (unchanged, correct)
  mvn_pr782_test_fail_raw        93.4% → 93.4%  (unchanged, correct)

Four new regression tests validate multi-module accumulation (20 tests
across 6 modules must not report only the first module's count), failure
enumeration uniqueness, reactor collapse, and javac error dedup.
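
The javac location dedup can be sketched as follows. This is an illustration only: it replaces the PR's regex with hand-rolled string slicing, `error_location` and `dedup_javac_errors` are hypothetical names, and the swallowing of indented context lines after a duplicate is left out.

```rust
use std::collections::HashSet;

const ERROR_TAG: &str = "[ERROR] ";

/// Returns the `<path>:[line,col]` slice of a javac error line, if any.
/// Hand-rolled stand-in for a regex; the exact line shape is an assumption.
fn error_location(line: &str) -> Option<&str> {
    let rest = line.strip_prefix(ERROR_TAG)?;
    let open = rest.find(":[")?;
    let close = open + rest[open..].find(']')?;
    let (l, c) = rest[open + 2..close].split_once(',')?;
    let numeric = |s: &str| !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit());
    (numeric(l) && numeric(c)).then(|| &rest[..=close])
}

/// Keeps the first occurrence of every javac location and drops repeats.
/// Keys borrow from `clean`, so no per-line String is allocated for them.
fn dedup_javac_errors(clean: &str) -> String {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut out = String::with_capacity(clean.len());
    for line in clean.lines() {
        if let Some(key) = error_location(line) {
            if !seen.insert(key) {
                continue; // same (path, line, col) already reported
            }
        }
        out.push_str(line);
        out.push('\n');
    }
    out
}
```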
Addresses the simplify review on 4e8db62 — behaviour and test output
unchanged (savings on PR782 fixtures identical: 98.7 / 93.4 / 97.8 / 79.7).

- `result: Vec<String>` → `String` with capacity hint. Previous code
  allocated one heap `String` per kept line and joined at the end; the
  original iterator-chain body (pre-state-machine) was zero-alloc via
  `Vec<&str>`. Incremental append on a preallocated `String` restores
  zero-allocation-per-line in the hot path.
- Dedup key `HashSet<(String, String, String)>` → `HashSet<&str>` keyed
  on the javac error prefix slice (borrows `clean`). Drops 3 String
  allocations per `[ERROR] path:[L,C]` line to 0.
- `reactor_modules: Vec<(String, String)>` → `Vec<(&str, &str)>` tied
  to `clean`'s lifetime. Drops 2 allocations per reactor summary line.
- Gate `COMPILE_ERROR_LOCATION_RE` with `line.starts_with(ERROR_TAG)`
  so the regex doesn't fire on every surviving [INFO]/bare line in a
  green build.
- Swap `COMPILE_ERROR_LOCATION_RE.captures()` for `.find()` — we only
  need the matched substring for the dedup key, not the capture groups.
- `REACTOR_SUMMARY_LINE_RE`: drop the dead `time` capture group
  (`format_reactor_summary` only reads name+status).
- `format_reactor_summary`: use `write!` macro against the output
  `String` instead of `push_str(&format!(...))` — drops 2 intermediate
  allocations on the BUILD FAILURE path.
- Tighten new `is_maven_boilerplate` entry from the generic "For more
  information about" to the Maven-canonical "For more information about
  the errors" — reduces false-positive surface on plugin log lines.
- Move `use std::collections::HashSet` to top-of-file imports (matches
  the rest of the module).
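
As an illustration of the `write!`-based summary with borrowed `(&str, &str)` pairs, here is a hedged sketch. `parse_reactor_line` is a hypothetical non-regex parser, the `[INFO] name .... STATUS` line shape is an assumption about Maven's reactor output, and the separator punctuation in the emitted line is simplified relative to the format quoted earlier in this PR.

```rust
use std::fmt::Write;

/// Parses `[INFO] core ........ SUCCESS [  1.2 s]` into (name, status).
/// Both slices borrow from `line`.
fn parse_reactor_line(line: &str) -> Option<(&str, &str)> {
    let rest = line.strip_prefix("[INFO] ")?;
    let dots = rest.find(" ..")?;
    let name = rest[..dots].trim();
    let status = rest[dots..]
        .trim_start_matches(|c: char| c == ' ' || c == '.')
        .split_whitespace()
        .next()?;
    matches!(status, "SUCCESS" | "FAILURE" | "SKIPPED").then(|| (name, status))
}

/// Appends one compact line, and only when at least one module failed.
fn format_reactor_summary(out: &mut String, modules: &[(&str, &str)]) {
    let failed: Vec<&str> = modules
        .iter()
        .filter(|(_, status)| *status == "FAILURE")
        .map(|(name, _)| *name)
        .collect();
    if failed.is_empty() {
        return; // all-green reactors rely on the surviving BUILD SUCCESS line
    }
    let ok = modules.iter().filter(|(_, s)| *s == "SUCCESS").count();
    // Writing into a String cannot fail, so the Result is ignored.
    let _ = writeln!(
        out,
        "Reactor: {} modules, {} SUCCESS, {} FAILURE ({})",
        modules.len(),
        ok,
        failed.len(),
        failed.join(", ")
    );
}
```

Writing straight into the caller's `String` avoids building an intermediate `format!` result that is then copied with `push_str`.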
Running our filter against the competing JVM PR's fixtures surfaced real
gaps on a multi-module Spring Boot success build with pgpverify-maven-plugin:
compile savings were only 50.8% (below our 60% target). This pushes them to
98.1% and closes a cosmetic stack-bleed on simple test failures.

filter_mvn_compile:
- Extend is_mvn_startup_noise to cover the banner emitted by `mvn -V`
  (Apache Maven x.y.z, Maven home, Java version, Default locale, OS name)
  and SLF4J static-binder complaints. All of these were previously kept as bare text.
- Add INFO_NOISE_PATTERNS for pgpverify-maven-plugin (`Verifying`,
  `Key server(s)`, `Create cache directory`, `Artifacts were already
  validated`, `artifact(s) in repository`), maven-resources-plugin
  (`encoding to copy`, `skip non existing resourceDirectory`) and
  maven-checkstyle-plugin clean-audit output (`Starting audit`, `Audit
  done`, `Checkstyle violations`).

REACTOR_BUILD_ORDER_RE:
- Also match the mvn 3.9.x default `<name> <version>` format (version
  token starting with a digit), not only the classic `[pom|jar|war|ear]`
  suffix variant. Previously the per-module listing escaped the Build
  Order collapse.

filter_mvn_tests_with_goal:
- Terminate the current FAILURE stack collection when the next class
  emits its `Running <class>` marker, so the marker does not bleed into
  the failure's details block.

Tests:
- 2 fixtures adapted from rtk-ai#1241 (pgp+multimodule success,
  single-module test failure).
- 3 regression tests pinning ≥85% savings on the compile fixture,
  absence of banner/JVM/plugin noise, and no stack bleed.

Full suite: 1714 passed.
…dvisory

Observed on a real build: a ~100-word maven-resources-plugin `[INFO]`
advisory ("The encoding used to copy filtered properties files have not
been set…") was slipping through the filter because our pattern was
`encoding to copy`, which does not match the `encoding used to copy`
variant. Broaden the substring to `copy filtered` so both forms collapse:

  - "Using 'UTF-8' encoding to copy filtered resources."
  - "The encoding used to copy filtered properties files have not been set…"

Adds a 3-line regression fixture and a test pinning that the advisory
disappears while `BUILD SUCCESS` and `Total time` are preserved.
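
The substring check can be sketched like this. The `INFO_NOISE_PATTERNS` name and the two substrings are quoted from commits in this PR; the two-entry list, `is_info_noise`, and `filter_info_noise` are assumptions for the example.

```rust
/// Substrings that mark an `[INFO]` line as noise. The real list in the
/// PR is longer; these two entries are quoted from its commit messages.
const INFO_NOISE_PATTERNS: &[&str] = &["copy filtered", "skip non existing resourceDirectory"];

/// True when an `[INFO]` line contains any known noise substring.
fn is_info_noise(line: &str) -> bool {
    line.starts_with("[INFO]") && INFO_NOISE_PATTERNS.iter().any(|p| line.contains(*p))
}

/// Drops noise lines and keeps everything else in order.
fn filter_info_noise(raw: &str) -> String {
    raw.lines()
        .filter(|l| !is_info_noise(l))
        .collect::<Vec<_>>()
        .join("\n")
}
```

The broader `copy filtered` substring matches both the `encoding to copy filtered resources` and the `encoding used to copy filtered properties` forms, which the narrower `encoding to copy` did not.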

Full suite: 1715 passed.
When `mvn test` is run against a project that fails to compile, the
build never reaches the `T E S T S` marker and our test filter was
returning "mvn test: no tests run" — hiding the real compile errors and
leaving the user with no signal about why the build failed.

filter_mvn_tests_with_goal now checks for `BUILD FAILURE` before
returning the cheerful short-circuit. When present, it falls back to
filter_mvn_compile, which correctly renders the javac error block.

Other early-exit cases (validate-only phases, plugin-only runs that
genuinely produce no tests) still return the concise "no tests run"
line, since those do not carry `BUILD FAILURE`.

Observed on the user's real `api` repo `mvn test` log (180 lines,
1029 tokens):
  before: `mvn test: no tests run`  (hides 18 compile errors)
  after:  67 lines, 338 tokens, 67.2% savings, full error block shown

Adds the 180-line raw log as a regression fixture and a test pinning
the new behaviour.

Full suite: 1716 passed.
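
The decision order described above, as a small sketch. Both functions are hypothetical stand-ins: `filter_compile_sketch` only keeps `[ERROR]` lines and the verdict, whereas the real `filter_mvn_compile` does much more, and the test-summary branch is elided.

```rust
/// Stand-in for the PR's compile filter: keeps `[ERROR]` lines and the
/// build verdict only.
fn filter_compile_sketch(raw: &str) -> String {
    raw.lines()
        .filter(|l| l.starts_with("[ERROR]") || l.contains("BUILD FAILURE"))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Decides what to print when surefire may never have started.
fn filter_tests_sketch(raw: &str) -> String {
    if raw.contains("T E S T S") {
        // The real filter would parse the surefire output here.
        return "mvn test: (test summary elided in this sketch)".to_string();
    }
    if raw.contains("BUILD FAILURE") {
        // Build died before the test phase: show the compile errors.
        return filter_compile_sketch(raw);
    }
    "mvn test: no tests run".to_string()
}
```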
Observed on a real Spring Boot project using Google Cloud's private
Maven repo via `artifactregistry-maven-wagon`. Compile-failure logs
carry large amounts of non-actionable chatter that drove savings down
to 41.6% on a 259-line fixture. Adds four detectors:

- INFO_NOISE_PATTERNS:
  * `is present in the local repository, but cached` — dozens of
    ~300-char `[INFO]` lines explaining which remote repos each
    cached artifact would otherwise resolve from. Never needs action.
  * `Initializing Credentials`, `Application Default Credentials`,
    `Refreshing Credentials` — GCP auth lifecycle lines emitted by
    the wagon during download.

- is_mvn_startup_noise:
  * `JUL_LOG_HEADER_RE` — detects java.util.logging header format
    (`Apr 18, 2026 12:19:27 AM com.google.auth.oauth2.X warnY`) so
    that the next-line WARNING body has the right context.
  * `BARE_PLUGIN_WARNING_PREFIXES = ["WARNING: Your application has
    authenticated"]` — body of the GCP end-user-credentials warning.

Fixture: 259-line anonymized capture of the user's real failing build.
Savings: 41.6% -> 86.0% (2419 -> 339 tokens). All compile errors
preserved; BUILD FAILURE + Total time preserved.
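
One possible reading of how the header detector and the bare-warning prefixes interact, sketched without a regex. `is_jul_header` is a token-based heuristic standing in for `JUL_LOG_HEADER_RE`, and `strip_jul_warnings` is a hypothetical name; the actual coupling between the two detectors in the PR may differ.

```rust
const BARE_PLUGIN_WARNING_PREFIXES: [&str; 1] = ["WARNING: Your application has authenticated"];

/// Heuristic match for a java.util.logging header such as
/// `Apr 18, 2026 12:19:27 AM com.google.auth.oauth2.X warnY`.
fn is_jul_header(line: &str) -> bool {
    let t: Vec<&str> = line.split_whitespace().collect();
    if t.len() != 7 {
        return false;
    }
    let digits = |s: &str| !s.is_empty() && s.chars().all(|c| c.is_ascii_digit());
    let month = t[0].len() == 3 && t[0].chars().all(|c| c.is_ascii_alphabetic());
    let day = t[1].strip_suffix(',').map_or(false, digits);
    let year = t[2].len() == 4 && digits(t[2]);
    let time = t[3].split(':').count() == 3 && t[3].split(':').all(digits);
    let meridiem = t[4] == "AM" || t[4] == "PM";
    let class = t[5].contains('.');
    month && day && year && time && meridiem && class
}

/// Drops JUL headers, plus a known warning body when it directly
/// follows a header. The same text elsewhere is left alone.
fn strip_jul_warnings(raw: &str) -> Vec<&str> {
    let mut kept = Vec::new();
    let mut after_header = false;
    for line in raw.lines() {
        if is_jul_header(line) {
            after_header = true;
            continue;
        }
        let is_body = after_header
            && BARE_PLUGIN_WARNING_PREFIXES.iter().any(|p| line.starts_with(*p));
        after_header = false;
        if !is_body {
            kept.push(line);
        }
    }
    kept
}
```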

Full suite: 1718 passed.
On every successful build, three maven plugins emit one-line
notifications that the user never acts on:

  - maven-enforcer-plugin: one `[INFO] Rule N: <fqcn> passed` line
    per enforced rule (typically RequireMavenVersion, RequireJavaVersion,
    RequirePluginVersions).
  - githook-maven-plugin: `[INFO] Installing commit-msg hook into
    <.git/hooks>` on every build.
  - maven-compiler-plugin: `[INFO] Changes detected - recompiling the
    module!` immediately before the real compile step.

Adds ENFORCER_RULE_PASSED_RE (an anchored regex that requires the trailing
` passed`) and substring patterns for the other two. Saves 3–8 lines on
a typical build. On the 259-line artifactregistry fixture, savings go
from 86.0% → 86.8% (2419 → 319 tokens, 64 lines). Compile errors and
`BUILD FAILURE` still preserved.

Full suite: 1719 passed.
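
What "anchored and requires the trailing ` passed`" buys can be shown with a regex-free equivalent. This is a sketch only: `is_enforcer_rule_passed` is a hypothetical helper, and the real `ENFORCER_RULE_PASSED_RE` pattern is not shown in this PR description.

```rust
/// Matches `[INFO] Rule <N>: <fqcn> passed` over the whole line.
fn is_enforcer_rule_passed(line: &str) -> bool {
    // Anchor at the start of the line.
    let Some(rest) = line.strip_prefix("[INFO] Rule ") else { return false };
    let Some((index, tail)) = rest.split_once(": ") else { return false };
    // Anchor at the end of the line: the verdict must be the last word.
    let Some(fqcn) = tail.strip_suffix(" passed") else { return false };
    !index.is_empty()
        && index.bytes().all(|b| b.is_ascii_digit())
        && !fqcn.is_empty()
        && !fqcn.contains(' ')
}
```

Requiring the suffix means a rule that fails, or any line that merely mentions `passed` mid-sentence, is never filtered out.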
Three large fixtures were padded with redundant noise that did not
strengthen any assertion — the filter strips a *pattern*, not a count.
Cut the repetitions while preserving every signal the assertions check:

- mvn_compile_artifactregistry.txt: 259 -> 107 (-59%)
- mvn_test_compile_failure.txt:     180 -> 52  (-71%)
- mvn_dep_tree_beacon.txt:          652 -> 281 (-57%)

Beacon snapshot regenerated (28 direct deps with intact transitive
counts). All three preserve their savings thresholds (>=80%, >=60%).

Renamed PR-tagged fixtures to behavior-based names so their purpose
stays clear once the PR number is forgotten:

- mvn_pr782_*_raw.txt        -> mvn_{compile,test}_reactor_*.txt
- mvn_pr1241_compile_pgp_*   -> mvn_compile_pgp_multimodule.txt
- mvn_pr1241_test_failure_*  -> mvn_test_failure_stack_isolation.txt

Test fn names updated to match. PR provenance moved into module-level
comments. cargo test --all: 1719 passed, 6 ignored.
…l review

Two HIGH findings from Codex adversarial review on PR rtk-ai#1089:

1) Reactor / `-pl <module>` runs lost failure details
   `enrich_with_reports` only probed `<cwd>/target/{surefire,failsafe}-reports`,
   so reactor builds and `mvn -pl <module>` from the repo root silently
   fell back to "no XML reports found" despite fresh per-module reports
   existing under `<module>/target/...`.

   Added `discover_report_dirs()` walking depth-1 module dirs (skipping
   `.*`, `target`, `src`, `node_modules`, `build`, `out`) and a
   `collect_reports()` merger that aggregates SurefireResults across
   every module that produced output.

2) Forked-VM crashes / surefire aborts reported as `0 passed`
   When the parser entered Testing but BUILD FAILURE arrived before a
   `Results:` block (forked-VM crash, surefire timeout, plugin abort),
   `cumulative` stayed at zero and the success branch emitted a
   synthetic `0 passed` summary — silently hiding a hard failure.

   Added a guard: when `BUILD FAILURE` is present and no failure was
   parsed, fall back to `filter_mvn_compile` so the actual error block
   surfaces to the user.

Tests:
- enrich_reactor_finds_per_module_reports — depth-1 module walk
- enrich_reactor_real_world_anliksim_multi_module — real fixture from
  anliksim/maven-multi-project-example (4-module reactor, module1 fails,
  module2 SKIPPED) using full pipeline (filter -> enrich)
- enrich_reactor_skips_dot_dirs_and_node_modules — walker boundary
- test_forked_vm_crash_never_emits_synthetic_pass — surefire booter
  exit-137 scenario

cargo test --all: 1723 passed, 6 ignored.
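
A minimal sketch of the depth-1 walk from finding 1, using only `std::fs`. The function name `discover_report_dirs` and the skip list come from the commit message; the signature, the `push_existing` helper, and the sorted return value are assumptions.

```rust
use std::fs;
use std::path::{Path, PathBuf};

const SKIPPED_DIRS: [&str; 5] = ["target", "src", "node_modules", "build", "out"];
const REPORT_DIRS: [&str; 2] = ["target/surefire-reports", "target/failsafe-reports"];

/// Records the report directories that exist under `root`.
fn push_existing(root: &Path, found: &mut Vec<PathBuf>) {
    for rel in REPORT_DIRS {
        let dir = root.join(rel);
        if dir.is_dir() {
            found.push(dir);
        }
    }
}

/// Probes `cwd` itself, then every direct child directory that is not
/// hidden and not on the skip list. Deeper nesting is not visited.
fn discover_report_dirs(cwd: &Path) -> Vec<PathBuf> {
    let mut found = Vec::new();
    push_existing(cwd, &mut found);
    let Ok(entries) = fs::read_dir(cwd) else { return found };
    for entry in entries.flatten() {
        let name = entry.file_name();
        let name = name.to_string_lossy();
        let name: &str = &name;
        if name.starts_with('.') || SKIPPED_DIRS.contains(&name) {
            continue;
        }
        let path = entry.path();
        if path.is_dir() {
            push_existing(&path, &mut found);
        }
    }
    found.sort(); // read_dir order is unspecified, so sort for stable output
    found
}
```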
The reactor multi-module fixtures landed with personal paths
(`/home/mariusz/...`) and a project-specific package name
(`anliksim`). Anonymize and compact:

- anliksim -> com.example.app (package + groupId)
- /home/mariusz/projects/sample/...  -> /workspace/example-project/

Compaction (filter still asserts the same signal — class.method,
ComparisonFailure message, BUILD FAILURE, no-reports-hint absent):

- mvn_test_reactor_module_failure.txt:    130 -> 56  (-57%)
  Drops repeated WARNING boilerplate, "Loaded N auto-discovered
  prefixes" repeats, BoM "Copying" pom lines, "skip non existing
  resourceDirectory" trivia, and the 30-line JUnit reflection
  stack trace (5-frame essence kept).

- TEST-com.example.app.SpeakerTest.xml:   ~70 -> 13 (-81%)
- TEST-com.example.app.AppTest.xml:       ~67 -> 4  (-94%)
  Drops the surefire <properties> block (60+ properties full of
  absolute paths and personal info — the parser never reads them).

cargo test --all: 1723 passed, 6 ignored.
XML reports are authoritative (stack trace, captured output), so strip
the text-filter's `Failures:` block whenever surefire/failsafe XML has
failures. Fallback path (XML missing) still surfaces the text block so
users keep context from stdout parsing.
When XML is missing and the text parser is the only failure source,
its output now matches the XML-enriched format:

1. Shorten exception FQN on the first detail line
   (`org.junit.ComparisonFailure:` -> `ComparisonFailure:`) via
   `shorten_exception_header`, mirroring `failure_kind_label`'s
   `rsplit('.').next()` logic.

2. Drop non-app stack frames when pom groupId is known
   (`app_packages` threaded through `filter_mvn_tests_with_goal`).
   `org.junit.Assert.*`, `org.hamcrest.*`, and similar assertion-lib
   frames were leaking through the legacy whitelist — now gated by
   `is_framework_frame_ext` which also rejects any `at <pkg>...`
   frame not in `app_packages`.

Empty `app_packages` falls back to the legacy whitelist — no
regression for fixtures and callers that do not detect a groupId.
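
Both steps can be sketched as below. The function names `shorten_exception_header` and `is_framework_frame_ext` and the `rsplit('.').next()` idea come from the commit message; the signatures, the whitespace guard, and the contents of `LEGACY_FRAMEWORK_PREFIXES` (borrowed from the PR summary) are assumptions.

```rust
/// Assumed stand-in for the legacy framework list.
const LEGACY_FRAMEWORK_PREFIXES: [&str; 3] = ["org.junit.", "java.base/", "org.apache.maven."];

/// `org.junit.ComparisonFailure: msg` -> `ComparisonFailure: msg`.
fn shorten_exception_header(line: &str) -> String {
    let (fqn, rest) = match line.split_once(':') {
        Some((fqn, rest)) => (fqn, Some(rest)),
        None => (line, None),
    };
    if fqn.contains(char::is_whitespace) {
        return line.to_string(); // not an exception header, leave as-is
    }
    let short = fqn.rsplit('.').next().unwrap_or(fqn);
    match rest {
        Some(rest) => format!("{short}:{rest}"),
        None => short.to_string(),
    }
}

/// With known app packages, every `at ...` frame outside them counts as
/// framework. With none, only the legacy prefixes do.
fn is_framework_frame_ext(line: &str, app_packages: &[&str]) -> bool {
    let Some(frame) = line.trim_start().strip_prefix("at ") else { return false };
    if app_packages.is_empty() {
        return LEGACY_FRAMEWORK_PREFIXES.iter().any(|p| frame.starts_with(*p));
    }
    // Plain prefix match; a stricter version would require a following '.'.
    !app_packages.iter().any(|pkg| frame.starts_with(*pkg))
}
```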
Upstream introduced src/cmds/jvm/ (gradlew). Consolidate the Maven filter
(mvn_cmd, pom_groupid, stack_trace, surefire_reports) under the same jvm/
umbrella so the module layout matches upstream.

- move src/cmds/java/* -> src/cmds/jvm/*
- crate::cmds::java:: -> crate::cmds::jvm::
- rename insta snapshots __java__ -> __jvm__ and fix source: headers
- drop `pub mod java;` (jvm/mod.rs automod already covers the dir)
- docs: src/cmds README, jvm README title + gradlew note, module map
@mariuszs mariuszs force-pushed the feat/mvn-rust-module branch from f79f814 to 6b203cb Compare May 29, 2026 12:41
The old prose described the pre-multi-goal Clap sub-enum with
external_subcommand routing, which no longer exists. Describe the actual
dispatch -> route_goal flow (0 goals passthrough, 1 goal dedicated
filter, >=2 goals multi-goal aggregator).
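
The three-way routing can be sketched as follows. The `Route` enum and this `route_goal` signature are hypothetical, and the goal detection is deliberately naive: any argument not starting with `-` is treated as a goal, which ignores options that take a separate value (for example `-pl <module>`).

```rust
/// Where a parsed command line ends up.
#[derive(Debug, PartialEq)]
enum Route<'a> {
    Passthrough,
    Single(&'a str),
    MultiGoal(Vec<&'a str>),
}

/// 0 goals: passthrough. 1 goal: dedicated filter. 2 or more goals:
/// multi-goal aggregator.
fn route_goal<'a>(args: &[&'a str]) -> Route<'a> {
    let goals: Vec<&'a str> = args
        .iter()
        .copied()
        .filter(|a| !a.starts_with('-'))
        .collect();
    match goals.len() {
        0 => Route::Passthrough,
        1 => Route::Single(goals[0]),
        _ => Route::MultiGoal(goals),
    }
}
```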

3 participants