Skip to content

Conversation

@VGalaxies
Copy link
Contributor

As title.

@VGalaxies VGalaxies changed the title Pipe: Support pattern pruning and redundancy removal in TreePattern parsing Pipe: support pattern pruning and redundancy removal in TreePattern parsing Nov 24, 2025
@VGalaxies VGalaxies requested a review from Copilot November 24, 2025 07:43
Copilot finished reviewing on behalf of VGalaxies November 24, 2025 07:46
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enhances the TreePattern parsing logic to support pattern pruning and redundancy removal. The changes optimize pattern lists by removing redundant patterns (e.g., if "root." exists, "root.db." is removed) and pruning inclusion patterns that are fully covered by exclusion patterns. This improves configuration validation and prevents the creation of pipe patterns that would match nothing.

Key Changes

  • Refactored pattern parsing to work with pattern lists before building union patterns
  • Added optimizePatterns() to remove redundant patterns within inclusion/exclusion lists
  • Added pruneInclusionPatterns() to remove inclusion patterns fully covered by exclusion patterns
  • Added early validation to throw a descriptive error when all inclusion patterns are pruned
  • Optimized buildUnionPattern() to return the pattern directly for single-element lists

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +423 to +427
} catch (final Exception e) {
// In case of path parsing errors or unsupported operations, assume no coverage
// to be safe and avoid aggressive pruning.
return false;
}
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The covers method silently catches all exceptions and returns false without logging. While the comment explains this is to "be safe and avoid aggressive pruning", this could hide legitimate errors such as IllegalPathException or NullPointerException. Consider logging the exception at DEBUG or TRACE level to aid in debugging, or at least being more specific about which exceptions are expected (e.g., catch only IllegalPathException).

Copilot uses AI. Check for mistakes.
Comment on lines +398 to 428
/** Checks if 'coverer' pattern fully covers 'coveree' pattern. */
private static boolean covers(final TreePattern coverer, final TreePattern coveree) {
try {
final List<PartialPath> covererPaths = coverer.getBaseInclusionPaths();
final List<PartialPath> covereePaths = coveree.getBaseInclusionPaths();

if (covererPaths.isEmpty() || covereePaths.isEmpty()) {
return false;
}

// Logic: For 'coverer' to cover 'coveree', ALL paths in 'coveree' must be included
// by at least one path in 'coverer'.
for (final PartialPath sub : covereePaths) {
boolean isSubCovered = false;
for (final PartialPath sup : covererPaths) {
if (sup.include(sub)) {
isSubCovered = true;
break;
}
}
if (!isSubCovered) {
return false;
}
}
return true;
} catch (final Exception e) {
// In case of path parsing errors or unsupported operations, assume no coverage
// to be safe and avoid aggressive pruning.
return false;
}
}
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new methods optimizePatterns, pruneInclusionPatterns, and covers lack documentation. While optimizePatterns has a brief comment explaining its purpose, the other two methods are completely undocumented. Consider adding JavaDoc comments that explain:

  • The purpose and algorithm of each method
  • Parameter descriptions and constraints
  • Return value specifications
  • Edge cases and examples

For instance, covers should document what "covers" means in terms of path matching semantics.

Copilot uses AI. Check for mistakes.
Comment on lines +369 to +396
/**
* Prunes patterns from the inclusion list that are fully covered by ANY pattern in the exclusion
* list.
*/
private static List<TreePattern> pruneInclusionPatterns(
final List<TreePattern> inclusion, final List<TreePattern> exclusion) {
if (inclusion == null || inclusion.isEmpty()) {
return new ArrayList<>();
}
if (exclusion == null || exclusion.isEmpty()) {
return inclusion;
}

// 4. If neither "source.path" nor "source.pattern" is specified,
// return the provided default pattern (which may be null).
return defaultPattern;
final List<TreePattern> prunedInclusion = new ArrayList<>();
for (final TreePattern inc : inclusion) {
boolean isFullyExcluded = false;
for (final TreePattern exc : exclusion) {
if (covers(exc, inc)) {
isFullyExcluded = true;
break;
}
}
if (!isFullyExcluded) {
prunedInclusion.add(inc);
}
}
return prunedInclusion;
}
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation for pruneInclusionPatterns is incomplete. While the method has a one-line comment, it should have proper JavaDoc documentation explaining:

  • What "fully covered" means in this context
  • Examples of when patterns would be pruned
  • Edge case behavior (e.g., empty lists)
  • The relationship with the covers helper method

Copilot uses AI. Check for mistakes.
Comment on lines +176 to +185
// 4. Prune inclusion patterns: if an inclusion pattern is fully covered by an exclusion
// pattern, remove it
inclusionPatterns = pruneInclusionPatterns(inclusionPatterns, exclusionPatterns);

// 5. Check if the resulting inclusion pattern is empty
if (inclusionPatterns.isEmpty()) {
throw new PipeException(
"Pipe: The inclusion pattern is empty after pruning by the exclusion pattern. "
+ "This pipe pattern will match nothing.");
}
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The pruning logic at this stage may result in redundant checks with the existing checkAndLogPatternCoverage method called in WithExclusionIoTDBTreePattern and WithExclusionTreePattern constructors (lines 201-204, 207-208). Both perform coverage checks, but at different granularities:

  • The new pruning checks individual patterns in lists before building unions
  • The existing check validates final TreePattern objects after construction

While this provides defense-in-depth, it could lead to different error messages for the same underlying issue. Consider documenting this relationship or consolidating the checks to avoid confusion.

Copilot uses AI. Check for mistakes.
* UnionTreePattern.
*/
private static TreePattern buildUnionPattern(
final boolean isTreeModelDataAllowedToBeCaptured, final List<TreePattern> patterns) {
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The buildUnionPattern method now has an early return for single-element lists, but it doesn't validate that the list is non-empty. If an empty list is passed, it will throw an IndexOutOfBoundsException when trying to access patterns.get(0). Consider adding a null/empty check before this optimization.

Suggested change
final boolean isTreeModelDataAllowedToBeCaptured, final List<TreePattern> patterns) {
final boolean isTreeModelDataAllowedToBeCaptured, final List<TreePattern> patterns) {
if (patterns == null || patterns.isEmpty()) {
throw new IllegalArgumentException("patterns list must not be null or empty");
}

Copilot uses AI. Check for mistakes.
Comment on lines +324 to +367
private static List<TreePattern> optimizePatterns(final List<TreePattern> patterns) {
if (patterns == null || patterns.isEmpty()) {
return new ArrayList<>();
}
if (patterns.size() == 1) {
return patterns;
}

// 3. If only "source.pattern" is specified, parse it using the helper method.
if (pattern != null) {
return buildUnionPattern(
isTreeModelDataAllowedToBeCaptured,
parsePatternsFromPatternParameter(
pattern, sourceParameters, isTreeModelDataAllowedToBeCaptured));
final List<TreePattern> optimized = new ArrayList<>();
// Determine coverage using base paths
for (int i = 0; i < patterns.size(); i++) {
final TreePattern current = patterns.get(i);
boolean isCoveredByOther = false;

for (int j = 0; j < patterns.size(); j++) {
if (i == j) {
continue;
}
final TreePattern other = patterns.get(j);

// If 'other' covers 'current', 'current' is redundant.
// Note: if they mutually cover each other (duplicates), we must ensure we keep one.
// We use index comparison to break ties for exact duplicates.
if (covers(other, current)) {
if (covers(current, other)) {
// Both cover each other (likely identical). Keep the one with smaller index.
if (j < i) {
isCoveredByOther = true;
break;
}
} else {
// Strict coverage
isCoveredByOther = true;
break;
}
}
}

if (!isCoveredByOther) {
optimized.add(current);
}
}
return optimized;
}
Copy link

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The optimizePatterns method has O(n²) complexity where each pattern is compared against all others, and each comparison involves calling covers() which iterates over paths. While this is acceptable for small pattern lists (typically a handful), consider documenting the expected maximum size of pattern lists in the method documentation to set appropriate expectations.

Copilot uses AI. Check for mistakes.
@VGalaxies VGalaxies marked this pull request as ready for review November 27, 2025 15:54
Copy link
Collaborator

@Caideyipi Caideyipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May add some tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants