[ci/test] Add index-benchmark Maven reactor + migrate util:expand perf-ratio tests to JMH #6387

@joewiz

[This issue was filed in response to @duncdrum's review on #6318.]

Context

PR #6318 ([optimize] util:expand: cache Lucene terms per query, skip empty-termMap scans) initially included two surefire-based performance assertions:

  • singleHitHighlightingNotOrderOfMagnitudeSlower: util:expand on one hit with highlighting on must be < 20× slower than highlighting off
  • batchWildcardExpandRatioUnderThreshold: batch util:expand($hits) must be < 15× slower than the optimised baseline

@duncdrum's review correctly observed that these don't belong in surefire: performance ratios are inherently flaky on shared CI runners (CPU variance, GC pauses, neighbour interference), so a hard threshold is either tight enough to fail spuriously or loose enough to be a no-op test. They belong in a JMH benchmark module.

exist-core already has a benchmark reactor (exist-core-jmh/) for that exact purpose. The extensions/indexes/ tree does not.

Scope

  1. Add a sibling extensions/indexes/lucene/lucene-benchmark/ (or shared extensions/indexes-benchmark/) Maven module mirroring exist-core-jmh/ — JMH configured, @Benchmark discovery, parameterised with a Lucene-indexed corpus fixture.
  2. Migrate the two perf-ratio tests from the deleted UtilExpandHighlightingPerformanceTest.java (see #6318 history) into @Benchmark methods:
    • expandSingleHitHighlightingOn vs …HighlightingOff: JMH reports a score for each, and the ratio is read from the two scores, with no hard threshold
    • expandBatchWildcard — same shape, batch over the wildcard-matched hits
  3. Wire the new module into the existing !concurrency-stress-tests,!micro-benchmarks skip-by-default profile so it doesn't run on every PR; expose via -Pmicro-benchmarks or equivalent.
  4. Document the expected baseline ratios in the benchmark class comment (the PR achieved a ~4.6× speedup on batch and roughly no change on single-hit, which was already fast).
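
A minimal sketch of how items 1 and 3 might look in the POMs, assuming the shared-module option and assuming the module is gated by listing it under the micro-benchmarks profile of the aggregator. The module name indexes-benchmark, the jmh.version property, and the file locations are placeholders for illustration only; they are not a description of how exist-core-jmh/ is wired today, and the profile's activation should simply match whatever the existing profile does.

```xml
<!-- Fragment 1: aggregator POM (assumed: extensions/indexes/pom.xml).
     The benchmark module is only part of the reactor when the
     micro-benchmarks profile is active. -->
<profiles>
  <profile>
    <id>micro-benchmarks</id>
    <modules>
      <!-- hypothetical module name -->
      <module>indexes-benchmark</module>
    </modules>
  </profile>
</profiles>

<!-- Fragment 2: the benchmark module's own POM.
     jmh-core provides the @Benchmark API; jmh-generator-annprocess is the
     annotation processor that generates the benchmark harness at compile time.
     ${jmh.version} is an assumed property, defined wherever versions are managed. -->
<dependencies>
  <dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version>${jmh.version}</version>
  </dependency>
  <dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-generator-annprocess</artifactId>
    <version>${jmh.version}</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```

Keeping the module inside a profile means a build that deactivates it with !micro-benchmarks never resolves or compiles the benchmark code, while -Pmicro-benchmarks opts in explicitly.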

Why this matters

  • Provides a regression guard for the LuceneMatchListener term-rewrite cache, so future changes to that class don't silently undo the speedup
  • Makes the benchmark discoverable for any future contributor optimising the highlighting path
  • Matches the pattern already established for exist-core perf work
