add new field updatedRows to QueryStatistics.java where it's availabl… #24810

Open
wants to merge 1 commit into
base: master
from

@bhzaeri bhzaeri commented Jan 27, 2025

…e in EventListener callbacks. It comes from TableMutationOperator.java where update/delete queries are issued.

Add the method recordUpdatedPositions to the OperatorContext class. Update updatedPositions for the queries run through MergeWriterOperator.java.

Description

This pull request addresses the following problem: after an UPDATE or DELETE query runs, outputRows in QueryStatistics.java is always 1, but we need the actual number of updated rows. We found that this number is returned by the getOutput() method of both TableMutationOperator.java and MergeWriterOperator.java. The operators pass the number of updated rows to the operatorContext instance, and from there it is propagated all the way down to QueryStatistics.java, which makes it available to event listeners.
So far, we have tested this successfully on SQL Server, Hive, MySQL, and PostgreSQL.
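
The flow described above can be illustrated with a minimal, self-contained sketch. It is not the code in this pull request: OperatorContextSketch, QueryStatisticsSketch, and summarize are hypothetical stand-ins, and only the method name recordUpdatedPositions is taken from the description. The sketch assumes that each operator records its own count and that the query-level value is the sum over operators.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified stand-ins; none of these are the actual Trino classes.
public class UpdatedRowsSketch
{
    // Plays the role of the per-operator context that accumulates counters.
    static final class OperatorContextSketch
    {
        private final AtomicLong updatedPositions = new AtomicLong();

        // Mirrors the idea of recordUpdatedPositions from the PR description.
        void recordUpdatedPositions(long positions)
        {
            if (positions < 0) {
                throw new IllegalArgumentException("positions is negative");
            }
            updatedPositions.addAndGet(positions);
        }

        long getUpdatedPositions()
        {
            return updatedPositions.get();
        }
    }

    // Plays the role of the statistics object handed to event listeners.
    static final class QueryStatisticsSketch
    {
        private final long outputRows;
        private final long updatedRows;

        QueryStatisticsSketch(long outputRows, long updatedRows)
        {
            this.outputRows = outputRows;
            this.updatedRows = updatedRows;
        }

        long getOutputRows()
        {
            return outputRows;
        }

        long getUpdatedRows()
        {
            return updatedRows;
        }
    }

    // Assumption: the query-level value is the sum of the per-operator counters.
    static QueryStatisticsSketch summarize(long outputRows, List<OperatorContextSketch> operators)
    {
        long updatedRows = operators.stream()
                .mapToLong(OperatorContextSketch::getUpdatedPositions)
                .sum();
        return new QueryStatisticsSketch(outputRows, updatedRows);
    }

    public static void main(String[] args)
    {
        OperatorContextSketch mutation = new OperatorContextSketch();
        OperatorContextSketch mergeWriter = new OperatorContextSketch();
        mutation.recordUpdatedPositions(40);
        mergeWriter.recordUpdatedPositions(2);

        // An UPDATE/DELETE produces a single result row, so outputRows is passed as 1 here.
        QueryStatisticsSketch stats = summarize(1, List.of(mutation, mergeWriter));
        System.out.println("outputRows=" + stats.getOutputRows()
                + ", updatedRows=" + stats.getUpdatedRows());
    }
}
```

The point of the sketch is only that outputRows and the updated-row count are separate fields, so a listener can read the second one without changing the meaning of the first.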

Additional context and related issues

In our ransomware defender platform, we need to monitor the behavior of users who have access to run queries on the DBs connected via Trino. So, we need to be notified of users' actions and the exact results of their actions. Trino returns the correct values for SELECT and INSERT queries. We need the same for UPDATE/DELETE as well.

The only issue is that the current pull request doesn't include the proper unit tests for the changes. Could you please help us with the proper way to write the unit tests?

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`24596`)

#24596

cla-bot bot commented Jan 27, 2025

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

Author

bhzaeri commented Jan 27, 2025

Hi @chenjian2664 @wendigo
I now update the rowCount in MergeWriterOperator.java as well as in TableMutationOperator.java, as you advised in the last pull request. I tested the current branch successfully with Hive, SQL Server, MySQL, and PostgreSQL.
Could you please take a look when you get the chance?
Bahram

@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch from 336a5e4 to 01467dd Compare January 27, 2025 18:36
Contributor

@chenjian2664 chenjian2664 left a comment

Is it possible to add a test for it?
Please rephrase the commit message following https://github.com/trinodb/trino/blob/master/.github/DEVELOPMENT.md

Author

bhzaeri commented Jan 28, 2025

That's actually my next question.
Is there a similar unit test I could look at to get an idea of how to implement the unit test for this?
Thanks!

@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch from 01467dd to 9fe872e Compare January 28, 2025 14:41
@cla-bot cla-bot bot added the cla-signed label Jan 28, 2025
@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch from 9fe872e to 2555e53 Compare January 31, 2025 20:46
@github-actions github-actions bot added hive Hive connector cassandra Cassandra connector redshift Redshift connector kudu labels Jan 31, 2025
@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch from 5755355 to de4be2d Compare February 3, 2025 03:32
@github-actions github-actions bot added iceberg Iceberg connector delta-lake Delta Lake connector loki Loki connector redis Redis connector labels Feb 3, 2025
Author

bhzaeri commented Feb 3, 2025

Hi @chenjian2664 @wendigo
I have added unit tests for the updated rows. Since the connector used in the event listener tests does not support update/delete, I had to add the new update/delete tests to BaseConnectorTest, which is inherited by all the connector tests for the different DBs.
Please take a look and let me know if there are any issues.
Thanks!
Bahram

@bhzaeri bhzaeri requested a review from chenjian2664 February 3, 2025 17:59
@github-actions github-actions bot added the faker Faker connector label Feb 3, 2025
@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch 2 times, most recently from de4be2d to 1948ead Compare February 3, 2025 18:50
Contributor

@chenjian2664 chenjian2664 left a comment

cc @Praveen2112
For the EventListener, I think you may have some thoughts

import java.util.concurrent.atomic.AtomicBoolean;

@ThreadSafe
public class EventsCollector
Contributor

There already exists an EventsCollector, why not reuse it?

Author

I explained it in the previous commit message:
The test (for UPDATE/DELETE queries) was added to BaseConnectorTest.java, and it needs classes such as EventsAwaitingQueries and QueryEvents that existed in the trino-tests module. So, I moved the needed classes from trino-tests to trino-testing, where they are exposed to both trino-tests and trino-testing.
In other words, I removed those classes from trino-tests and added them to trino-testing so that both the old and the new tests have access to them.
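
For readers unfamiliar with this kind of test helper, here is a rough sketch of the general pattern: a thread-safe holder that the listener fills in and that a test blocks on until the completed event arrives. It is an assumption-laden illustration, not the actual EventsCollector or QueryEvents code; CompletedEventWaiterSketch, CompletedEventSketch, and their methods are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; the real EventsCollector/QueryEvents classes in Trino differ.
public class CompletedEventWaiterSketch
{
    // Stand-in for a completed-query event; it only carries the value under test.
    static final class CompletedEventSketch
    {
        private final long updatedRows;

        CompletedEventSketch(long updatedRows)
        {
            this.updatedRows = updatedRows;
        }

        long getUpdatedRows()
        {
            return updatedRows;
        }
    }

    private final AtomicReference<CompletedEventSketch> completed = new AtomicReference<>();
    private final CountDownLatch latch = new CountDownLatch(1);

    // Called from the listener thread when the query finishes; accepts one event only.
    public void queryCompleted(CompletedEventSketch event)
    {
        if (!completed.compareAndSet(null, event)) {
            throw new IllegalStateException("completed event already recorded");
        }
        latch.countDown();
    }

    // Called from the test thread; blocks until the event arrives or the timeout expires.
    public CompletedEventSketch waitForCompleted(long timeout, TimeUnit unit)
    {
        try {
            if (!latch.await(timeout, unit)) {
                throw new IllegalStateException("timed out waiting for the completed event");
            }
        }
        catch (InterruptedException e) {
            // Restore the interrupt flag before surfacing the failure.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return completed.get();
    }

    public static void main(String[] args)
    {
        CompletedEventWaiterSketch waiter = new CompletedEventWaiterSketch();

        // Simulates the engine delivering the event on another thread.
        Thread listener = new Thread(() -> waiter.queryCompleted(new CompletedEventSketch(5)));
        listener.start();

        CompletedEventSketch event = waiter.waitForCompleted(10, TimeUnit.SECONDS);
        System.out.println("updatedRows=" + event.getUpdatedRows());
    }
}
```

Because a helper like this has no dependency on a specific connector, it can live in a shared testing module and be used from a base test class.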

import static java.util.concurrent.TimeUnit.MILLISECONDS;

@ThreadSafe
public class QueryEvents
Contributor

Same here, why not reuse it?

Author

Please refer to the last review.

@@ -332,6 +332,7 @@ private static QueryStats immediateFailureQueryStats()
DataSize.ofBytes(0),
0,
0,
Contributor

For the commit messages, wrap the subject line at 50 characters and the body at 72 characters; see https://github.com/trinodb/trino/blob/master/.github/DEVELOPMENT.md.

@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch from 1948ead to 19f5709 Compare February 4, 2025 16:05
@bhzaeri bhzaeri requested a review from Praveen2112 February 4, 2025 16:11
@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch 2 times, most recently from 23f1604 to aa5f1fa Compare February 4, 2025 20:52
@bhzaeri bhzaeri requested a review from chenjian2664 February 7, 2025 17:12
@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch 3 times, most recently from 1e2a07c to 4ddc159 Compare February 18, 2025 16:36
Author

bhzaeri commented Feb 18, 2025

Hi @Praveen2112 @chenjian2664 @wendigo
Any thoughts?

@bhzaeri bhzaeri self-assigned this Feb 18, 2025
@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch from 4ddc159 to 3b77122 Compare February 24, 2025 18:32
@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch 3 times, most recently from baa4f5e to 37ed58a Compare March 10, 2025 15:58
@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch 2 times, most recently from fe310d5 to 6051153 Compare March 13, 2025 13:57
@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch 2 times, most recently from a6509ed to 193b4aa Compare March 19, 2025 17:02
@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch 6 times, most recently from 4cefaad to 2e1e751 Compare April 14, 2025 15:44
Come from TableMutationOperator and MergeWriteOperator's rowCount
@bhzaeri bhzaeri force-pushed the event-listener-updated-rows-number-4 branch from 2e1e751 to eb69b6f Compare April 15, 2025 14:31
Labels
cassandra Cassandra connector cla-signed delta-lake Delta Lake connector faker Faker connector hive Hive connector iceberg Iceberg connector kudu loki Loki connector redis Redis connector redshift Redshift connector
2 participants