Skip to content

Conversation

@wuxueyang96
Copy link
Contributor

@wuxueyang96 wuxueyang96 commented May 12, 2025

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Which issues of this PR fixes :

Fixes #

Problem Summary(Required) :

Currently, we have a StarRocksDynamicLookupFunction which collects all of the data on StarRocks and caches the data until it expires. It is good at handling small-scale data, but bad at handling large-scale data.

The pr rename StarRocksDynamicLookupFunction to StarRocksDynamicCachedLookupFunction, and add a new StarRocksDynamicLookupFunction to execute the query directly.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr will affect users' behaviors
  • This pr needs user documentation (for new or modified features or behaviors)
  • I have added documentation for my new feature or new function

Signed-off-by: wuxueyang.wxy <[email protected]>
@jaogoy jaogoy requested review from banmoy and Copilot and removed request for banmoy July 28, 2025 09:50
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new lookup execution strategy that allows users to choose between cached and direct database query approaches for StarRocks lookups. The cached approach is suitable for small datasets, while the direct query approach better handles large-scale data.

  • Adds a new configuration option lookup.cache.enabled to control lookup behavior
  • Introduces a new StarRocksDynamicLookupFunction for direct JDBC queries
  • Renames the existing cached lookup implementation to StarRocksDynamicCachedLookupFunction

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
StarRocksSourceOptions.java Adds LOOKUP_ENABLE_CACHE config option and getter method
StarRocksDynamicTableSourceFactory.java Registers the new cache enable option
StarRocksDynamicTableSource.java Implements factory method to choose between cached/direct lookup functions
StarRocksDynamicLookupFunction.java Complete rewrite to implement direct JDBC-based lookup functionality
StarRocksDynamicCachedLookupFunction.java New file containing the original cached lookup implementation
Comments suppressed due to low confidence (1)

public Collection<RowData> lookup(RowData rowData) {
reloadData();
Row keyRow = genRow(rowData);
return cacheMap.get(keyRow);
Copy link

Copilot AI Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method returns null when no matching data is found, but the return type is Collection<RowData>. This could cause NullPointerException in calling code. Should return Collections.emptyList() instead.

Suggested change
return cacheMap.get(keyRow);
return cacheMap.getOrDefault(keyRow, Collections.emptyList());

Copilot uses AI. Check for mistakes.
Comment on lines +88 to +90
GenericRowData gRowData = (GenericRowData) rowData;
Object[] keyObj = new Object[filterRichInfos.length];
for (int i = 0; i < filterRichInfos.length; i ++) {
Copy link

Copilot AI Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method assumes the input RowData is a GenericRowData without type checking, which could cause ClassCastException. Should add type checking or handle different RowData implementations.

Suggested change
GenericRowData gRowData = (GenericRowData) rowData;
Object[] keyObj = new Object[filterRichInfos.length];
for (int i = 0; i < filterRichInfos.length; i ++) {
if (!(rowData instanceof GenericRowData)) {
LOG.error("Expected GenericRowData but received: {}", rowData.getClass().getName());
throw new IllegalArgumentException("Invalid RowData type. Expected GenericRowData.");
}
GenericRowData gRowData = (GenericRowData) rowData;
Object[] keyObj = new Object[filterRichInfos.length];
for (int i = 0; i < filterRichInfos.length; i++) {

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant