feat: [Dataplex Tool] Search Data Quality Scans #2444
Andres-Ayala1 wants to merge 20 commits into googleapis:main
|
/gemini review |
Summary of Changes
Hello @Andres-Ayala1, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request enhances the Dataplex integration by adding the ability to search for Data Quality Scans. The feature gives users programmatic access to discover and filter data quality scans by various criteria, which supports automated data quality monitoring and governance workflows. The implementation includes updates to the Dataplex source, documentation, and tests.
|
Code Review
This pull request introduces a new dataplex-search-dq-scans tool, enhancing the Dataplex source to allow searching for data quality scans. The changes include updating the cmd/root.go and cmd/root_test.go files to register and test the new tool, adding comprehensive documentation in docs/en/resources/tools/dataplex/dataplex-search-dq-scans.md, and modifying the internal/prebuiltconfigs/tools/dataplex.yaml to include the new tool in the prebuilt configurations. The internal/sources/dataplex/dataplex.go file was updated to support the new DataScanClient and implement the SearchDataQualityScans method. Additionally, tests/dataplex/dataplex_integration_test.go was updated with new setup functions and integration tests for the new tool, ensuring its functionality and correct behavior under various conditions, including authorization and different filtering parameters. Overall, the changes are well-implemented and thoroughly tested.
internal/tools/dataplex/dataplexsearchdqscans/dataplexsearchdqscans.go (two outdated review threads, both resolved)
Code Review
This pull request introduces a new dataplex-search-dq-scans tool, enhancing the Dataplex integration by allowing users to search for data quality scans. The changes include adding the tool to the CLI, updating the Dataplex source to support data scan operations, creating comprehensive documentation, and implementing unit and integration tests. The overall structure and implementation align well with existing patterns in the repository. The new functionality is well-documented and tested, which is great for maintainability and reliability.
internal/tools/dataplex/dataplexsearchdqscans/dataplexsearchdqscans.go (one outdated review thread, resolved)
      t.Fatalf("Failed to check dataset %q existence: %v", datasetName, err)
  }
- metadataToCreate := &bigqueryapi.DatasetMetadata{Name: datasetName}
+ metadataToCreate := &bigqueryapi.DatasetMetadata{Name: datasetName, Location: "us"}
The Location: "us" is hardcoded here. While this might be acceptable for integration tests, it's generally better practice to use a configurable location or derive it from the DataplexProject if possible, to make tests more robust across different environments or regions.
Suggested change:
- metadataToCreate := &bigqueryapi.DatasetMetadata{Name: datasetName, Location: "us"}
+ metadataToCreate := &bigqueryapi.DatasetMetadata{Name: datasetName, Location: "us-central1"}
…er to maintain consistency + more explicit error msg
…le_test for search_dq_Scans
|
@duwenxin99 are you able to review or provide guidance on steps I should take for further testing of this new tool? |
|
/gcbrun |
|
Hi @Andres-Ayala1 the integration tests are sufficient but they are failing. Could you make sure they are passing locally? Thanks! Let me know when you need them triggered again. You also need to run |
Updated the Dataplex tool implementation to use 'data.resource' instead of 'data.entity' for BigQuery table filtering within Dataplex DataScan API
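For context on the commit above: in the Dataplex DataScan API, `data.resource` identifies a BigQuery table by its service-qualified resource name, whereas `data.entity` refers to a Dataplex entity path. A minimal sketch of building such a resource name follows. The helper name and the sample IDs are hypothetical, and how the tool embeds the value in its filter is not shown in this conversation.

```go
package main

import "fmt"

// bigQueryTableResource builds the service-qualified resource name that a
// DataScan's data.resource field holds for a BigQuery table.
// The helper itself is illustrative and does not come from the PR.
func bigQueryTableResource(project, dataset, table string) string {
	return fmt.Sprintf("//bigquery.googleapis.com/projects/%s/datasets/%s/tables/%s",
		project, dataset, table)
}

func main() {
	// Sample IDs are placeholders.
	fmt.Println(bigQueryTableResource("my-project", "my_dataset", "my_table"))
}
```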
…ation test, and using correct contentkey to access API response
|
hi @duwenxin99 i fixed the lint errors with |
|
/gcbrun |
|
@duwenxin99 hi, it seems on the latest run of /gcbrun the dataplex source integration test was cancelled and an unrelated source where I made no changes (cockroachdb) encountered a failure. Are there breaking changes that have been merged since my last commit that would affect my PR? |
|
@Andres-Ayala1, there were some integration test failures last week, and they are now fixed in the latest main. Could you rebase, please? I'll re-trigger the tests. |

Description
New Tool: Added dataplex-search-dq-scans tool
Source Update: Enhanced dataplex source to support searching data quality scans.
Documentation: Created docs/en/resources/tools/dataplex/dataplex-search-dq-scans.md
Integration: Added to prebuilt dataplex.yaml and CLI registration.
Testing: Added full unit and integration tests in dataplex_integration_test.go
Impact: This feature enables users to programmatically search for and discover Dataplex Data Quality Scans directly from the Toolbox (CLI and MCP). It facilitates automated workflows for data quality monitoring and governance. The implementation includes parameters for filtering (by scan ID or table name), pagination, and ordering.
🛠️ Fixes #2398