Add support for TopN and aggregation pushdown in Elasticsearch #23118

Open. Wants to merge 4 commits into base: master

murthy-chelankuri
Member

Description

This pull request adds support for pushing down TopN and Aggregation to Elasticsearch.
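As a rough illustration of what these pushdowns mean on the Elasticsearch side (not the code in this PR), a TopN can be expressed as a search body with `sort` and `size`, and a grouped aggregation as a `terms` bucket aggregation with a nested metric. The class and method names below are hypothetical, and the bodies are plain Elasticsearch Query DSL rendered as strings:

```java
// Hypothetical helper, not part of the PR: renders Elasticsearch search bodies as JSON text.
final class PushdownSketch
{
    private PushdownSketch() {}

    // ORDER BY <field> [ASC|DESC] LIMIT <limit>  ->  "sort" plus "size"
    static String topNBody(String field, boolean ascending, long limit)
    {
        return String.format(
                "{\"size\":%d,\"sort\":[{\"%s\":{\"order\":\"%s\"}}]}",
                limit, field, ascending ? "asc" : "desc");
    }

    // SELECT <groupField>, sum(<metricField>) ... GROUP BY <groupField>
    //   ->  a "terms" bucket aggregation with a nested "sum" metric.
    // "size":0 asks Elasticsearch to return no hits, only aggregation results.
    static String sumByTermBody(String groupField, String metricField)
    {
        return String.format(
                "{\"size\":0,\"aggs\":{\"groups\":{\"terms\":{\"field\":\"%s\"},"
                        + "\"aggs\":{\"total\":{\"sum\":{\"field\":\"%s\"}}}}}}",
                groupField, metricField);
    }

    public static void main(String[] args)
    {
        System.out.println(topNBody("price", false, 10));
        System.out.println(sumByTermBody("region", "price"));
    }
}
```

Note that a `terms` aggregation returns only the top 10 buckets unless its own `size` is set, so a real implementation would need to handle bucket limits or pagination; the sketch ignores that.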

Additional context and related issues

Opening a new pull request because the previous one was automatically closed as stale and could not be reopened.

#16919

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
(x) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Fixes
* Support for order by pushdown in elasticsearch connector #12381 
* Elasticsearch connector aggregation push down support #7026 
* Elasticsearch connector TopN pushdown #4803 

cla-bot bot commented Aug 23, 2024

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Murthy Chelankuri.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. Check that your git client is configured with an email to sign commits: git config --list | grep email
  2. If not, set it up using: git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cla-bot cla-bot bot removed the cla-signed label Aug 23, 2024
@murthy-chelankuri murthy-chelankuri force-pushed the es-pushdown-topn-aggregation1 branch from f75d92c to 2935da8 Compare August 23, 2024 18:40
@cla-bot cla-bot bot added the cla-signed label Aug 23, 2024
@murthy-chelankuri
Member Author

@martint @kokosing @hashhar Could you please review this PR at your earliest convenience?

@ebyhr ebyhr changed the title from "Add support for TopN and aggregation pushdown" to "Add support for TopN and aggregation pushdown in Elasticsearch" Aug 24, 2024
@Praveen2112
Member

Can we split the TopN and aggregation pushdown into separate commits? That would make the review a bit easier.

Comment on lines 62 to 37
public TopN addSortItem(TopNSortItem sortItem)
{
    topNSortItems.add(sortItem);
    return this;
}
Member

It is generally recommended to make the TopN object immutable instead of mutating its state here.

Member Author

Thanks @Praveen2112 for the feedback. Removed this method and made TopN immutable.
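
For readers following the thread, a minimal sketch of what an immutable TopN could look like. The field names and the `withSortItem` method are assumptions for illustration only; the PR's actual TopN and TopNSortItem definitions are not shown in this conversation:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shapes; the real TopN and TopNSortItem in the PR may differ.
record TopNSortItem(String column, boolean ascending) {}

record TopN(List<TopNSortItem> sortItems, long limit)
{
    TopN
    {
        // Defensive, unmodifiable copy so callers cannot change the list later.
        sortItems = List.copyOf(sortItems);
    }

    // Returns a new TopN with the extra sort item instead of mutating this one.
    TopN withSortItem(TopNSortItem item)
    {
        List<TopNSortItem> items = new ArrayList<>(sortItems);
        items.add(item);
        return new TopN(items, limit);
    }
}
```

With this shape, a value such as a table handle that holds a TopN cannot change after it is constructed, which is the property the review comment asks for.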

Comment on lines 88 to 104
SUPPORTS_COMMENT_ON_COLUMN,
SUPPORTS_COMMENT_ON_TABLE,
SUPPORTS_CREATE_MATERIALIZED_VIEW,
SUPPORTS_CREATE_SCHEMA,
SUPPORTS_CREATE_TABLE,
SUPPORTS_CREATE_VIEW,
SUPPORTS_DELETE,
SUPPORTS_INSERT,
SUPPORTS_MERGE,
SUPPORTS_RENAME_COLUMN,
SUPPORTS_RENAME_TABLE,
SUPPORTS_ROW_TYPE,
SUPPORTS_SET_COLUMN_TYPE,
SUPPORTS_UPDATE -> false;
case SUPPORTS_LIMIT_PUSHDOWN,
SUPPORTS_TOPN_PUSHDOWN,
SUPPORTS_AGGREGATION_PUSHDOWN -> true;
Member

Can we revert the indentation changes?

Member Author

Thanks @Praveen2112 for the feedback. Reverted the indentation.

Comment on lines 1076 to 1079
"SELECT " +
"text_column " +
"FROM " + indexName + " " +
"WHERE text_column LIKE 's_.m%ex\\t'"))
Member

Can we revert the indentation?

Member Author

I don't see any issue with the indentation in my editor. I just compared with the previous version and don't see any change in the indentation. Can you please check again?

OptionalLong limit)
List<TermAggregation> termAggregations,
List<MetricAggregation> metricAggregations,
Optional<TopN> topN)
Member

Instead of maintaining a TopN that also carries the limit, can we have an Optional<List<SortOrder>>? If the list is present we could treat it as TopN, otherwise as Limit. But we need to handle the case of TopN followed by Limit, and the other way around.

Member Author

@Praveen2112, can we keep TopN instead of splitting it into two arguments, Optional<List> and limit? If the split is required, we can change it accordingly.
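
To make the alternative under discussion concrete, here is a sketch of the "optional sort list plus limit" shape, including one possible way to handle Limit after TopN and TopN after Limit. Every name here (`SortItem`, `QueryShape`, `withLimit`, `withTopN`) is hypothetical and not taken from the PR; the stacking rules are an assumption based on SQL semantics, not the connector's actual behavior:

```java
import java.util.List;
import java.util.Optional;
import java.util.OptionalLong;

// All names are hypothetical; they only illustrate the shape discussed above.
record SortItem(String column, boolean ascending) {}

record QueryShape(Optional<List<SortItem>> sortItems, OptionalLong limit)
{
    QueryShape
    {
        sortItems = sortItems.map(List::copyOf);
    }

    static QueryShape unbounded()
    {
        return new QueryShape(Optional.empty(), OptionalLong.empty());
    }

    // Sort list present means TopN; limit alone means a plain Limit.
    boolean isTopN()
    {
        return sortItems.isPresent() && limit.isPresent();
    }

    // Limit on top of the current shape: keep any existing sort, take the smaller limit.
    QueryShape withLimit(long newLimit)
    {
        long effective = limit.isPresent() ? Math.min(limit.getAsLong(), newLimit) : newLimit;
        return new QueryShape(sortItems, OptionalLong.of(effective));
    }

    // TopN on top of the current shape: if a limit was already applied, the rows
    // were already cut before sorting, so the two cannot be merged. Returning
    // empty signals that the TopN should stay in the engine.
    Optional<QueryShape> withTopN(List<SortItem> items, long newLimit)
    {
        if (limit.isPresent()) {
            return Optional.empty();
        }
        return Optional.of(new QueryShape(Optional.of(items), OptionalLong.of(newLimit)));
    }
}
```

Limit after TopN collapses cleanly (same ordering, smaller limit), while TopN after Limit is rejected in this sketch because sorting an already truncated result is not equivalent to a single sorted, limited query.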

@hashhar hashhar removed their request for review September 27, 2024 20:49

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

@github-actions github-actions bot added the stale label Oct 22, 2024

Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.

@murthy-chelankuri murthy-chelankuri force-pushed the es-pushdown-topn-aggregation1 branch from c85c5eb to 02c2107 Compare December 14, 2024 17:53
@github-actions github-actions bot removed the stale label Dec 16, 2024

github-actions bot commented Jan 6, 2025

This pull request has gone a while without any activity. Tagging for triage help: @mosabua

@github-actions github-actions bot added the stale label Jan 6, 2025
@mosabua mosabua added the stale-ignore label and removed the stale label Jan 10, 2025
@mosabua
Member

mosabua commented Jan 10, 2025

Added the stale-ignore label since I think @murthy-chelankuri wants to continue working on this. Do you know what your next steps are, @murthy-chelankuri, or do you need help from reviewers?

@github-actions github-actions bot added the release-notes, docs, ui, jdbc, hudi, iceberg, delta-lake, hive, bigquery, and mongodb labels Jan 16, 2025
@mosabua
Member

mosabua commented Jan 16, 2025

You need to rebase this PR to proceed, @murthy-chelankuri. It is currently not reviewable.

@murthy-chelankuri murthy-chelankuri force-pushed the es-pushdown-topn-aggregation1 branch from 13032b5 to 02c2107 Compare January 16, 2025 19:19
@cocobeach1911

Thank you @murthy-chelankuri for building this!

Any more updates on this?

Labels
bigquery, cla-signed, delta-lake, docs, hive, hudi, iceberg, jdbc, mongodb, release-notes, stale-ignore, ui
4 participants