feat: [Dataplex Tool] Search Data Quality Scans#2444

Open
Andres-Ayala1 wants to merge 20 commits into googleapis:main from Andres-Ayala1:feat/dataplex-search-dq-scans

Conversation

@Andres-Ayala1 commented Feb 10, 2026

Description

Should include a concise description of the changes (bug or feature), its
impact, along with a summary of the solution

New Tool: Added the dataplex-search-dq-scans tool.

Source Update: Enhanced the dataplex source to support searching data quality scans.

Documentation: Created docs/en/resources/tools/dataplex/dataplex-search-dq-scans.md.

Integration: Added to the prebuilt dataplex.yaml and CLI registration.

Testing: Added full unit and integration tests in dataplex_integration_test.go.

Impact: This feature enables users to programmatically search for and discover Dataplex Data Quality Scans directly from the Toolbox (CLI and MCP). It facilitates automated workflows for data quality monitoring and governance. The implementation includes parameters for filtering (by scan ID or table name), pagination, and ordering.
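
To make the three kinds of parameters concrete (filtering, pagination, ordering), here is a minimal in-memory sketch in Go. It is not the code from this PR: the `Scan` and `SearchParams` types, their field names, and the `Search` function are hypothetical, and the actual tool sends these parameters to the Dataplex API rather than filtering locally.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Scan is a hypothetical, simplified stand-in for a data quality scan record.
type Scan struct {
	ID    string
	Table string
}

// SearchParams mirrors the kinds of parameters the description mentions.
// The field names are illustrative only and are not taken from the PR.
type SearchParams struct {
	DataScanID string // exact scan ID to match; empty means "any"
	TableName  string // substring of the table name; empty means "any"
	PageSize   int    // maximum number of results; <= 0 means "no limit"
	OrderByID  bool   // sort results by ID when true
}

// Search filters, orders, and truncates scans in memory.
func Search(scans []Scan, p SearchParams) []Scan {
	out := make([]Scan, 0, len(scans))
	for _, s := range scans {
		if p.DataScanID != "" && s.ID != p.DataScanID {
			continue
		}
		if p.TableName != "" && !strings.Contains(s.Table, p.TableName) {
			continue
		}
		out = append(out, s)
	}
	if p.OrderByID {
		sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	}
	if p.PageSize > 0 && len(out) > p.PageSize {
		out = out[:p.PageSize]
	}
	return out
}

func main() {
	scans := []Scan{
		{ID: "scan-b", Table: "sales.orders"},
		{ID: "scan-a", Table: "sales.customers"},
		{ID: "scan-c", Table: "hr.employees"},
	}
	params := SearchParams{TableName: "sales", PageSize: 1, OrderByID: true}
	for _, s := range Search(scans, params) {
		fmt.Println(s.ID, s.Table) // prints "scan-a sales.customers"
	}
}
```

The order of operations matters in this sketch: filtering happens first, then ordering, then truncation to the page size, so a page always contains the first matches in sorted order.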

PR Checklist

Thank you for opening a Pull Request! Before submitting your PR, there are a
few things you can do to make sure it goes smoothly:

  • Make sure you reviewed
    CONTRIBUTING.md
  • Make sure to open an issue as a
    bug/issue
    before writing your code! That way we can discuss the change, evaluate
    designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)
  • Make sure to add ! if this involves a breaking change

🛠️ Fixes #2398

@Andres-Ayala1
Author

/gemini review

@gemini-code-assist
Contributor

Summary of Changes

Hello @Andres-Ayala1, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Dataplex integration by introducing a new capability to search for Data Quality Scans. This feature empowers users to automate data quality monitoring and governance workflows by providing programmatic access to discover and filter data quality scans based on various criteria. The implementation includes necessary updates to the Dataplex source, comprehensive documentation, and robust testing to ensure its stability and usability.

Highlights

  • New Dataplex Tool: Introduced a new dataplex-search-dq-scans tool, enabling users to search for Dataplex Data Quality Scans programmatically.
  • Dataplex Source Enhancement: The core Dataplex source was updated to support the new data quality scan search functionality, including the integration of a DataScanClient.
  • Comprehensive Documentation: Dedicated documentation for the dataplex-search-dq-scans tool was added, detailing its purpose, parameters, and IAM requirements.
  • Tool Integration: The new tool has been integrated into the prebuilt dataplex.yaml configuration and registered for CLI usage.
  • Extensive Testing: Full unit and integration tests were added to ensure the reliability and correct functionality of the new data quality scan search tool.

Changelog

  • cmd/root.go
    • Imported the new dataplexsearchdqscans package to register the tool.
  • cmd/root_test.go
    • Added search_dq_scans to the list of ToolNames for dataplex_tools in the prebuilt tools test.
  • docs/en/resources/tools/dataplex/dataplex-search-dq-scans.md
    • Added new documentation for the dataplex-search-dq-scans tool, including its description, compatible sources, parameters, IAM requirements, and an example.
  • internal/prebuiltconfigs/tools/dataplex.yaml
    • Defined the search_dq_scans tool with its type, source, and description.
    • Included search_dq_scans in the dataplex_tools toolset.
  • internal/sources/dataplex/dataplex.go
    • Modified the Initialize method to return both CatalogClient and DataScanClient.
    • Added a DataScanClient field to the Source struct.
    • Implemented the GetDataScanClient method to expose the DataScanClient.
    • Updated initDataplexConnection to initialize and return a new DataScanClient.
    • Added the SearchDataQualityScans method to perform data quality scan searches using the DataScanClient.
  • internal/tools/dataplex/dataplexsearchdqscans/dataplexsearchdqscans.go
    • Added a new Go file defining the dataplex-search-dq-scans tool.
    • Implemented the newConfig function for tool configuration parsing.
    • Defined the Config struct for tool parameters.
    • Implemented the Initialize method to set up tool parameters like filter, data_scan_id, table_name, pageSize, and orderBy.
    • Implemented the Invoke method to execute the data quality scan search using the Dataplex source.
  • internal/tools/dataplex/dataplexsearchdqscans/dataplexsearchdqscans_test.go
    • Added a new Go test file for the dataplex-search-dq-scans tool.
    • Included TestParseFromYamlDataplexSearchDQScans to verify correct YAML parsing for the tool's configuration.
  • tests/dataplex/dataplex_integration_test.go
    • Added DataplexSearchDataQualityScansToolType constant.
    • Implemented setupDataplexSearchDataQualityScan to create test data quality scans for integration tests.
    • Added initDataplexDataScanConnection to establish a connection to the Dataplex DataScan API.
    • Modified TestDataplexToolEndpoints to include setup, teardown, and invocation tests for the new data quality scan tool.
    • Updated setupBigQueryTable to include a schema definition for the created BigQuery table.
    • Extended getDataplexToolsConfig to include configurations for the new my-dataplex-search-dq-scans-tool and its authenticated version.
    • Added new test cases to runDataplexToolGetTest for the new data quality scan tool.
    • Implemented runDataplexSearchDataQualityScansToolInvokeTest to perform end-to-end invocation tests for the data quality scan search functionality, covering success and authorization failure scenarios.

Activity

  • The pull request introduces a new feature and includes full unit and integration tests to validate its functionality.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new dataplex-search-dq-scans tool, enhancing the Dataplex source to allow searching for data quality scans. The changes include updating the cmd/root.go and cmd/root_test.go files to register and test the new tool, adding comprehensive documentation in docs/en/resources/tools/dataplex/dataplex-search-dq-scans.md, and modifying the internal/prebuiltconfigs/tools/dataplex.yaml to include the new tool in the prebuilt configurations. The internal/sources/dataplex/dataplex.go file was updated to support the new DataScanClient and implement the SearchDataQualityScans method. Additionally, tests/dataplex/dataplex_integration_test.go was updated with new setup functions and integration tests for the new tool, ensuring its functionality and correct behavior under various conditions, including authorization and different filtering parameters. Overall, the changes are well-implemented and thoroughly tested.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new dataplex-search-dq-scans tool, enhancing the Dataplex integration by allowing users to search for data quality scans. The changes include adding the tool to the CLI, updating the Dataplex source to support data scan operations, creating comprehensive documentation, and implementing unit and integration tests. The overall structure and implementation align well with existing patterns in the repository. The new functionality is well-documented and tested, which is great for maintainability and reliability.

  t.Fatalf("Failed to check dataset %q existence: %v", datasetName, err)
  }
- metadataToCreate := &bigqueryapi.DatasetMetadata{Name: datasetName}
+ metadataToCreate := &bigqueryapi.DatasetMetadata{Name: datasetName, Location: "us"}
Contributor

medium

The Location: "us" is hardcoded here. While this might be acceptable for integration tests, it's generally better practice to use a configurable location or derive it from the DataplexProject if possible, to make tests more robust across different environments or regions.

Suggested change
- metadataToCreate := &bigqueryapi.DatasetMetadata{Name: datasetName, Location: "us"}
+ metadataToCreate := &bigqueryapi.DatasetMetadata{Name: datasetName, Location: "us-central1"}

@Andres-Ayala1 changed the title from "[Dataplex Tool] Search Data Quality Scans" to "feat: [Dataplex Tool] Search Data Quality Scans" on Feb 17, 2026
@Andres-Ayala1
Author

@duwenxin99 are you able to review or provide guidance on steps I should take for further testing of this new tool?

@Andres-Ayala1
Author

[Image: example_gemini_cli_response] Here is an output I got from this tool with a general search of "most recent data quality scans in my project".

@duwenxin99
Contributor

/gcbrun

@duwenxin99 added the "tests: run" label (Label to trigger Github Action tests.) on Feb 19, 2026
@github-actions bot removed the "tests: run" label (Label to trigger Github Action tests.) on Feb 19, 2026
@duwenxin99
Contributor

duwenxin99 commented Feb 19, 2026

Hi @Andres-Ayala1, the integration tests are sufficient but they are failing. Could you make sure they are passing locally? Thanks! Let me know when you need them triggered again. You also need to run golangci-lint run to fix the lint errors.

Updated the Dataplex tool implementation to use 'data.resource' instead of 'data.entity' for BigQuery table filtering within Dataplex DataScan API
…ation test, and using correct contentkey to access API response
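
For context on the `data.resource` change in the commit above: Dataplex documents the data source resource of a scan over a BigQuery table as a service-qualified resource name. A minimal sketch of building that name follows; the helper `bigQueryTableResource` is hypothetical, and how the PR matches this value against scans is not shown here.

```go
package main

import "fmt"

// bigQueryTableResource builds the service-qualified resource name of a
// BigQuery table. The helper name is hypothetical and not taken from the PR.
func bigQueryTableResource(project, dataset, table string) string {
	return fmt.Sprintf(
		"//bigquery.googleapis.com/projects/%s/datasets/%s/tables/%s",
		project, dataset, table,
	)
}

func main() {
	// Prints "//bigquery.googleapis.com/projects/my-project/datasets/my_dataset/tables/my_table".
	fmt.Println(bigQueryTableResource("my-project", "my_dataset", "my_table"))
}
```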
@Andres-Ayala1
Author

Hi @duwenxin99, I fixed the lint errors with golangci-lint and ran the integration tests locally; they are passing. Please run /gcbrun when you get the chance. Thanks! :)

@duwenxin99 requested review from a team as code owners on March 8, 2026 19:21
@duwenxin99
Contributor

/gcbrun

@duwenxin99 added the "tests: run" label (Label to trigger Github Action tests.) on Mar 8, 2026
@github-actions bot removed the "tests: run" label (Label to trigger Github Action tests.) on Mar 8, 2026
@Andres-Ayala1
Author

Hi @duwenxin99, it seems that on the latest run of /gcbrun the dataplex source integration test was cancelled, and an unrelated source that I did not change (cockroachdb) encountered a failure.

Are there breaking changes that have been merged since my last commit that would affect my PR?

@duwenxin99
Contributor

@Andres-Ayala1, there were some integration test failures last week, and they are fixed now in the latest main. Could you rebase please? I'll re-trigger the test.

@duwenxin99 added the labels "status: waiting for response" (Status: reviewer is awaiting feedback or responses from the author before proceeding.) and "priority: p2" (Moderately-important priority. Fix may not be included in next release.) on Mar 23, 2026
Development

Successfully merging this pull request may close these issues.

Dataplex Data Quality Search DQ Scans Tool

2 participants