feat: [Dataplex Tool] Search Data Quality Scans #2444
Open
Andres-Ayala1 wants to merge 20 commits into googleapis:main from Andres-Ayala1:feat/dataplex-search-dq-scans
Changes from 7 commits
20 commits
758817d feat: add DataScan API to dataplex source (Andres-Ayala1)
ae34cb6 feat:add search-dq-scans tool to prebuilt configs (Andres-Ayala1)
6f78e51 test: test for dataplex search dq scans tool (Andres-Ayala1)
d33cc22 feat: dataplexsearchdqscans tool (Andres-Ayala1)
4429ed9 docs: dataplex-search-dq-scans docs (Andres-Ayala1)
37e2c3e test: add new tool searchdqscans to TestPrebuiltTools (Andres-Ayala1)
2ebf9d0 test: add search dq scans invoke and get integration tests (Andres-Ayala1)
febac4c chore: remove comments (Andres-Ayala1)
1416018 fix: replace parameter for Search DataQualityScans from query to filt… (Andres-Ayala1)
4821398 Merge branch 'main' into feat/dataplex-search-dq-scans (Andres-Ayala1)
72fd260 feat: change prebuilt tools check to new cmd/internal/skills tools_fi… (Andres-Ayala1)
9b7d961 docs: parameters for search-dq-scans tool data scan id and table name (Andres-Ayala1)
6dd9ee5 fix: util errors for compilation to pass (Andres-Ayala1)
1f9713f chore: license year headers (Andres-Ayala1)
b899ae2 fix: registration of searchdqscans tool (Andres-Ayala1)
074f968 Merge branch 'main' into feat/dataplex-search-dq-scans (duwenxin99)
462a376 chore: lint formatting (Andres-Ayala1)
1d527af fix: correct table filter for data quality scans (Andres-Ayala1)
671480d fix: passing full resource id of table and dataset properly to integr… (Andres-Ayala1)
310eb1b Merge branch 'main' into feat/dataplex-search-dq-scans (duwenxin99)
64 changes: 64 additions & 0 deletions
docs/en/resources/tools/dataplex/dataplex-search-dq-scans.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,64 @@ | ||
| --- | ||
| title: "dataplex-search-dq-scans" | ||
| type: docs | ||
| weight: 1 | ||
| description: > | ||
| A "dataplex-search-dq-scans" tool searches for data quality scans based on the provided parameters. | ||
| aliases: | ||
| - /resources/tools/dataplex-search-dq-scans | ||
| --- | ||
|
|
||
| ## About | ||
|
|
||
| A `dataplex-search-dq-scans` tool returns data quality scans that match the given criteria. | ||
| It's compatible with the following sources: | ||
|
|
||
| - [dataplex](../../sources/dataplex.md) | ||
|
|
||
| `dataplex-search-dq-scans` accepts the following optional parameters: | ||
|
|
||
| - `filter` - Filter string to search/filter data quality scans. E.g. "display_name = \"my-scan\"". | ||
| - `data_scan_id` - The ID of the data scan to filter by. | ||
| - `table_name` - The name of the table to filter by. | ||
|
||
| - `pageSize` - Number of returned data quality scans in the page. Defaults to `10`. | ||
| - `orderBy` - Specifies the ordering of results. | ||
|
|
||
| ## Requirements | ||
|
|
||
| ### IAM Permissions | ||
|
|
||
| Dataplex uses [Identity and Access Management (IAM)][iam-overview] to control | ||
| user and group access to Dataplex resources. Toolbox will use your | ||
| [Application Default Credentials (ADC)][adc] to authorize and authenticate when | ||
| interacting with [Dataplex][dataplex-docs]. | ||
|
|
||
| In addition to [setting the ADC for your server][set-adc], you need to ensure | ||
| the IAM identity has been given the correct IAM permissions for the tasks you | ||
| intend to perform. See [Dataplex Universal Catalog IAM permissions][iam-permissions] | ||
| and [Dataplex Universal Catalog IAM roles][iam-roles] for more information on | ||
| applying IAM permissions and roles to an identity. | ||
|
|
||
| [iam-overview]: https://cloud.google.com/dataplex/docs/iam-and-access-control | ||
| [adc]: https://cloud.google.com/docs/authentication#adc | ||
| [set-adc]: https://cloud.google.com/docs/authentication/provide-credentials-adc | ||
| [iam-permissions]: https://cloud.google.com/dataplex/docs/iam-permissions | ||
| [iam-roles]: https://cloud.google.com/dataplex/docs/iam-roles | ||
| [dataplex-docs]: https://cloud.google.com/dataplex | ||
|
|
||
| ## Example | ||
|
|
||
| ```yaml | ||
| kind: tools | ||
| name: dataplex-search-dq-scans | ||
| type: dataplex-search-dq-scans | ||
| source: my-dataplex-source | ||
| description: Use this tool to search for data quality scans. | ||
| ``` | ||
|
|
||
| ## Reference | ||
|
|
||
| | **field** | **type** | **required** | **description** | | ||
| |-------------|:--------:|:------------:|----------------------------------------------------| | ||
| | type | string | true | Must be "dataplex-search-dq-scans". | | ||
| | source | string | true | Name of the source the tool should execute on. | | ||
| | description | string | true | Description of the tool that is passed to the LLM. | | ||
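Review note: a minimal sketch of how the three optional filter parameters documented above appear to combine, assuming the AND-joining done in `Invoke` in this revision of the PR. `buildFilter` is a hypothetical helper name used only for illustration; it is not part of the PR, and the field names `display_name` and `data.entity` are copied from this revision rather than confirmed against the Dataplex API.

```go
package main

import (
	"fmt"
	"strings"
)

// buildFilter is a hypothetical helper that mirrors the filter assembly in
// this revision of Invoke: each non-empty input becomes one clause, and the
// clauses are joined with " AND ".
func buildFilter(filter, dataScanID, tableName string) string {
	var clauses []string
	if filter != "" {
		// The raw filter string is passed through unchanged.
		clauses = append(clauses, filter)
	}
	if dataScanID != "" {
		// Assumption carried over from the PR: the ID is matched on display_name.
		clauses = append(clauses, fmt.Sprintf("display_name = %q", dataScanID))
	}
	if tableName != "" {
		// Assumption carried over from the PR: the table is matched on data.entity.
		clauses = append(clauses, fmt.Sprintf("data.entity = %q", tableName))
	}
	return strings.Join(clauses, " AND ")
}

func main() {
	// prints: display_name = "my-scan" AND data.entity = "my-table"
	fmt.Println(buildFilter("", "my-scan", "my-table"))
}
```

With all three inputs empty, the sketch returns an empty string, which corresponds to an unfiltered search in the PR's code path.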
165 changes: 165 additions & 0 deletions
internal/tools/dataplex/dataplexsearchdqscans/dataplexsearchdqscans.go
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,165 @@ | ||
| // Copyright 2025 Google LLC | ||
| // | ||
| // Licensed under the Apache License, Version 2.0 (the "License"); | ||
| // you may not use this file except in compliance with the License. | ||
| // You may obtain a copy of the License at | ||
| // | ||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||
| // | ||
| // Unless required by applicable law or agreed to in writing, software | ||
| // distributed under the License is distributed on an "AS IS" BASIS, | ||
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. | ||
|
|
||
| package dataplexsearchdqscans | ||
|
|
||
| import ( | ||
| "context" | ||
| "fmt" | ||
| "strings" | ||
|
|
||
| "cloud.google.com/go/dataplex/apiv1/dataplexpb" | ||
| "github.com/goccy/go-yaml" | ||
| "github.com/googleapis/genai-toolbox/internal/embeddingmodels" | ||
| "github.com/googleapis/genai-toolbox/internal/sources" | ||
| "github.com/googleapis/genai-toolbox/internal/tools" | ||
| "github.com/googleapis/genai-toolbox/internal/util/parameters" | ||
| ) | ||
|
|
||
| const resourceType string = "dataplex-search-dq-scans" | ||
|
|
||
| func init() { | ||
| if !tools.Register(resourceType, newConfig) { | ||
| panic(fmt.Sprintf("tool type %q already registered", resourceType)) | ||
| } | ||
| } | ||
|
|
||
| func newConfig(ctx context.Context, name string, decoder *yaml.Decoder) (tools.ToolConfig, error) { | ||
| actual := Config{Name: name} | ||
| if err := decoder.DecodeContext(ctx, &actual); err != nil { | ||
| return nil, err | ||
| } | ||
| return actual, nil | ||
| } | ||
|
|
||
| type compatibleSource interface { | ||
| SearchDataQualityScans(context.Context, string, int, string) ([]*dataplexpb.DataScan, error) | ||
| } | ||
|
|
||
| type Config struct { | ||
| Name string `yaml:"name" validate:"required"` | ||
| Type string `yaml:"type" validate:"required"` | ||
| Source string `yaml:"source" validate:"required"` | ||
| Description string `yaml:"description"` | ||
| AuthRequired []string `yaml:"authRequired"` | ||
| } | ||
|
|
||
| // validate interface | ||
| var _ tools.ToolConfig = Config{} | ||
|
|
||
| func (cfg Config) ToolConfigType() string { | ||
| return resourceType | ||
| } | ||
|
|
||
| func (cfg Config) Initialize(srcs map[string]sources.Source) (tools.Tool, error) { | ||
| filter := parameters.NewStringParameterWithDefault("filter", "", "Optional. Filter string to search/filter data quality scans. E.g. \"display_name = \\\"my-scan\\\"\"") | ||
| dataScanID := parameters.NewStringParameterWithDefault("data_scan_id", "", "Optional. The ID of the data scan to filter by.") | ||
|
||
| tableName := parameters.NewStringParameterWithDefault("table_name", "", "Optional. The name of the table to filter by.") | ||
| pageSize := parameters.NewIntParameterWithDefault("pageSize", 10, "Number of returned data quality scans in the page.") | ||
| orderBy := parameters.NewStringParameterWithDefault("orderBy", "", "Specifies the ordering of results.") | ||
| params := parameters.Parameters{filter, dataScanID, tableName, pageSize, orderBy} | ||
|
|
||
| mcpManifest := tools.GetMcpManifest(cfg.Name, cfg.Description, cfg.AuthRequired, params, nil) | ||
|
|
||
| t := Tool{ | ||
| Config: cfg, | ||
| Parameters: params, | ||
| manifest: tools.Manifest{ | ||
| Description: cfg.Description, | ||
| Parameters: params.Manifest(), | ||
| AuthRequired: cfg.AuthRequired, | ||
| }, | ||
| mcpManifest: mcpManifest, | ||
| } | ||
| return t, nil | ||
| } | ||
|
|
||
| type Tool struct { | ||
| Config | ||
| Parameters parameters.Parameters | ||
| manifest tools.Manifest | ||
| mcpManifest tools.McpManifest | ||
| } | ||
|
|
||
| func (t Tool) ToConfig() tools.ToolConfig { | ||
| return t.Config | ||
| } | ||
|
|
||
| func (t Tool) Invoke(ctx context.Context, resourceMgr tools.SourceProvider, params parameters.ParamValues, accessToken tools.AccessToken) (any, error) { | ||
| source, err := tools.GetCompatibleSource[compatibleSource](resourceMgr, t.Source, t.Name, t.Type) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| paramsMap := params.AsMap() | ||
| filter, _ := paramsMap["filter"].(string) | ||
| dataScanID, _ := paramsMap["data_scan_id"].(string) | ||
| tableName, _ := paramsMap["table_name"].(string) | ||
| pageSize, _ := paramsMap["pageSize"].(int) | ||
| orderBy, _ := paramsMap["orderBy"].(string) | ||
|
|
||
| var filters []string | ||
| if filter != "" { | ||
| filters = append(filters, filter) | ||
| } | ||
| if dataScanID != "" { | ||
| // data_scan_id is matched against display_name in this revision. | ||
| // This is an assumption: the ID may instead be part of the resource name, | ||
| // in which case a filter on the name (possibly with a wildcard) would be needed. | ||
| filters = append(filters, fmt.Sprintf("display_name = %q", dataScanID)) | ||
|
||
| } | ||
| if tableName != "" { | ||
| // "data.entity" is typically used for table in DataScan filters | ||
| filters = append(filters, fmt.Sprintf("data.entity = %q", tableName)) | ||
| } | ||
|
|
||
| finalFilter := strings.Join(filters, " AND ") | ||
|
|
||
| return source.SearchDataQualityScans(ctx, finalFilter, pageSize, orderBy) | ||
| } | ||
|
|
||
| func (t Tool) EmbedParams(ctx context.Context, paramValues parameters.ParamValues, embeddingModelsMap map[string]embeddingmodels.EmbeddingModel) (parameters.ParamValues, error) { | ||
| return parameters.EmbedParams(ctx, t.Parameters, paramValues, embeddingModelsMap, nil) | ||
| } | ||
|
|
||
| func (t Tool) Manifest() tools.Manifest { | ||
| // Returns the tool manifest | ||
| return t.manifest | ||
| } | ||
|
|
||
| func (t Tool) McpManifest() tools.McpManifest { | ||
| // Returns the tool MCP manifest | ||
| return t.mcpManifest | ||
| } | ||
|
|
||
| func (t Tool) Authorized(verifiedAuthServices []string) bool { | ||
| return tools.IsAuthorized(t.AuthRequired, verifiedAuthServices) | ||
| } | ||
|
|
||
| func (t Tool) RequiresClientAuthorization(resourceMgr tools.SourceProvider) (bool, error) { | ||
| return false, nil | ||
| } | ||
|
|
||
| func (t Tool) GetAuthTokenHeaderName(resourceMgr tools.SourceProvider) (string, error) { | ||
| return "Authorization", nil | ||
| } | ||
|
|
||
| func (t Tool) GetParameters() parameters.Parameters { | ||
| return t.Parameters | ||
| } | ||
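Review note: `Invoke` above reads each parameter with a comma-ok type assertion and discards the `ok` result, so a missing or mistyped value silently becomes the zero value (`""` or `0`). Whether the parameter layer already guarantees the types is not visible from this diff. The sketch below shows one possible stricter alternative; `intParam` is a hypothetical helper, not an existing function in the repository.

```go
package main

import "fmt"

// intParam is a hypothetical helper: it reads key from a parameter map and
// returns an error instead of silently falling back to 0 when the value is
// missing or is not an int.
func intParam(m map[string]any, key string) (int, error) {
	v, ok := m[key]
	if !ok {
		return 0, fmt.Errorf("missing parameter %q", key)
	}
	n, ok := v.(int)
	if !ok {
		return 0, fmt.Errorf("parameter %q has type %T, want int", key, v)
	}
	return n, nil
}

func main() {
	// A string where an int is expected, on purpose.
	params := map[string]any{"pageSize": "10"}

	// Pattern used in Invoke: the failed assertion yields 0 without an error.
	silent, _ := params["pageSize"].(int)
	fmt.Println(silent) // prints 0

	// Stricter variant: the type mismatch is reported to the caller.
	_, err := intParam(params, "pageSize")
	fmt.Println(err) // prints: parameter "pageSize" has type string, want int
}
```

The trade-off is extra error handling in `Invoke` in exchange for surfacing bad input instead of sending a page size of 0 to the source.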