Commit 62b626c

feat(source/cloud-storage): add Cloud Storage source with list_objects and read_object tools
Adds a new project-scoped `cloud-storage` source using ADC, plus two read-only tools: `cloud-storage-list-objects` (with prefix/delimiter/pagination) and `cloud-storage-read-object` (with HTTP-style byte range and base64 payload). Introduces a GCS-aware error classifier in `cloudstoragecommon` that splits failures into Agent errors (missing bucket/object, bad request, unsatisfiable range) and Server errors (auth, IAM denial, quota, 5xx, cancellation) per DEVELOPER.md, replacing the coarse-grained `util.ProcessGcpError`. Ships YAML-parse unit tests, an error-classifier unit test, a range-parser unit test, a live-GCS integration test (12 sub-tests, UUID-suffixed bucket with self-cleanup), docs under `docs/en/integrations/cloud-storage/`, and a `cloud-storage` CI shard. The remaining 12 tools from the approved design doc land in follow-up PRs.
1 parent 9e543d7 commit 62b626c

19 files changed, +1721 -8 lines changed
.ci/integration.cloudbuild.yaml

Lines changed: 28 additions & 0 deletions
@@ -448,6 +448,34 @@ steps:
           exit 0
         fi
 
+  - id: "cloud-storage"
+    name: golang:1
+    waitFor: ["compile-test-binary", "detect-changes"]
+    entrypoint: /bin/bash
+    env:
+      - "GOPATH=/gopath"
+      - "CLOUD_STORAGE_PROJECT=$PROJECT_ID"
+      - "SERVICE_ACCOUNT_EMAIL=$SERVICE_ACCOUNT_EMAIL"
+    secretEnv: ["CLIENT_ID"]
+    volumes:
+      - name: "go"
+        path: "/gopath"
+    args:
+      - -c
+      - |
+        PATTERN="cloudstorage|internal/server/|.ci/"
+
+        if grep -qE "$$PATTERN" /workspace/changed_files.txt; then
+          echo "Relevant changes detected. Running Cloud Storage tests..."
+          .ci/test_with_coverage.sh \
+            "Cloud Storage" \
+            cloudstorage \
+            cloudstorage
+        else
+          echo "No relevant changes for Cloud Storage. Skipping shard."
+          exit 0
+        fi
+
   - id: "postgres"
     name: golang:1
     waitFor: ["compile-test-binary", "detect-changes"]

cmd/internal/imports.go

Lines changed: 3 additions & 0 deletions
@@ -83,6 +83,8 @@ import (
 	_ "github.com/googleapis/mcp-toolbox/internal/tools/cloudsqlpg/vectorassistdefinespec"
 	_ "github.com/googleapis/mcp-toolbox/internal/tools/cloudsqlpg/vectorassistgeneratequery"
 	_ "github.com/googleapis/mcp-toolbox/internal/tools/cloudsqlpg/vectorassistmodifyspec"
+	_ "github.com/googleapis/mcp-toolbox/internal/tools/cloudstorage/cloudstoragelistobjects"
+	_ "github.com/googleapis/mcp-toolbox/internal/tools/cloudstorage/cloudstoragereadobject"
 	_ "github.com/googleapis/mcp-toolbox/internal/tools/cockroachdb/cockroachdbexecutesql"
 	_ "github.com/googleapis/mcp-toolbox/internal/tools/cockroachdb/cockroachdblistschemas"
 	_ "github.com/googleapis/mcp-toolbox/internal/tools/cockroachdb/cockroachdblisttables"
@@ -261,6 +263,7 @@ import (
 	_ "github.com/googleapis/mcp-toolbox/internal/sources/cloudsqlmssql"
 	_ "github.com/googleapis/mcp-toolbox/internal/sources/cloudsqlmysql"
 	_ "github.com/googleapis/mcp-toolbox/internal/sources/cloudsqlpg"
+	_ "github.com/googleapis/mcp-toolbox/internal/sources/cloudstorage"
 	_ "github.com/googleapis/mcp-toolbox/internal/sources/cockroachdb"
 	_ "github.com/googleapis/mcp-toolbox/internal/sources/couchbase"
 	_ "github.com/googleapis/mcp-toolbox/internal/sources/dataplex"

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+---
+title: "Cloud Storage"
+weight: 1
+---

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+---
+title: "Cloud Storage Source"
+linkTitle: "Source"
+type: docs
+weight: 1
+description: >
+  Cloud Storage is Google Cloud's managed service for storing unstructured objects (files) in buckets. Toolbox connects at the project level, allowing tools to list, read, and manage objects across any bucket the credentials can access.
+no_list: true
+---
+
+## About
+
+[Cloud Storage][gcs-docs] is Google Cloud's managed service for storing
+unstructured data (blobs) in containers called *buckets*. Buckets live in a GCP
+project; objects are addressed by `gs://<bucket>/<object>`.
+
+If you are new to Cloud Storage, you can try the
+[quickstart][gcs-quickstart] to create a bucket and upload your first objects.
+
+The Cloud Storage source is configured at the **project** level. Individual
+tools take a `bucket` parameter, so a single configured source can operate
+against any bucket the underlying credentials are authorized for.
+
+[gcs-docs]: https://cloud.google.com/storage/docs
+[gcs-quickstart]: https://cloud.google.com/storage/docs/discover-object-storage-console
+
+## Available Tools
+
+{{< list-tools >}}
+
+## Requirements
+
+### IAM Permissions
+
+Cloud Storage uses [Identity and Access Management (IAM)][iam-overview] to
+control access to buckets and objects. Toolbox uses your
+[Application Default Credentials (ADC)][adc] to authorize and authenticate when
+interacting with Cloud Storage.
+
+In addition to [setting the ADC for your server][set-adc], ensure the IAM
+identity has the appropriate role for the tools being exposed. Common roles:
+
+- `roles/storage.objectViewer` — read-only access to objects (sufficient for
+  `cloud-storage-list-objects` and `cloud-storage-read-object`)
+- `roles/storage.objectUser` — read and write access to objects
+- `roles/storage.admin` — full control, including bucket management
+
+See [Cloud Storage IAM roles][gcs-iam] for the full list.
+
+[iam-overview]: https://cloud.google.com/storage/docs/access-control/iam
+[adc]: https://cloud.google.com/docs/authentication#adc
+[set-adc]: https://cloud.google.com/docs/authentication/provide-credentials-adc
+[gcs-iam]: https://cloud.google.com/storage/docs/access-control/iam-roles
+
+## Example
+
+```yaml
+kind: source
+name: my-gcs-source
+type: "cloud-storage"
+project: "my-project-id"
+```
+
+## Reference
+
+| **field** | **type** | **required** | **description** |
+|-----------|:--------:|:------------:|---------------------------------------------------------------------------------|
+| type | string | true | Must be "cloud-storage". |
+| project | string | true | Id of the GCP project the configured source is associated with (e.g. "my-project-id"). |

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+---
+title: "Tools"
+weight: 2
+---

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+---
+title: "cloud-storage-list-objects"
+type: docs
+weight: 1
+description: >
+  A "cloud-storage-list-objects" tool lists objects in a Cloud Storage bucket, with optional prefix filtering and delimiter-based grouping.
+---
+
+## About
+
+A `cloud-storage-list-objects` tool returns the objects in a
+[Cloud Storage bucket][gcs-buckets]. It supports the usual GCS listing options:
+
+- `prefix` — filter results to objects whose names begin with the given string.
+- `delimiter` — group results by this character (typically `/`) so subdirectory-like
+  "common prefixes" are returned separately from the leaf objects.
+- `max_results` / `page_token` — paginate through large listings.
+
+The response is a JSON object with `objects` (the full object metadata as
+returned by the Cloud Storage API — fields such as `Name`, `Size`, `ContentType`,
+`Updated`, `StorageClass`, `MD5`, etc.), `prefixes` (the common prefixes when
+`delimiter` is set), and `nextPageToken` (empty when there are no more pages).
+
+[gcs-buckets]: https://cloud.google.com/storage/docs/buckets
+
+## Compatible Sources
+
+{{< compatible-sources >}}
+
+## Parameters
+
+| **parameter** | **type** | **required** | **description** |
+|---------------|:--------:|:------------:|-------------------------------------------------------------------------------------------------------------------|
+| bucket | string | true | Name of the Cloud Storage bucket to list objects from. |
+| prefix | string | false | Filter results to objects whose names begin with this prefix. |
+| delimiter | string | false | Delimiter used to group object names (typically '/'). When set, common prefixes are returned as `prefixes`. |
+| max_results | integer | false | Maximum number of objects to return per page. A value of 0 uses the API default (1000); the maximum allowed is 1000. |
+| page_token | string | false | A previously-returned page token for retrieving the next page of results. |
+
+## Example
+
+```yaml
+kind: tool
+name: list_objects
+type: cloud-storage-list-objects
+source: my-gcs-source
+description: Use this tool to list objects in a Cloud Storage bucket.
+```
+
+## Reference
+
+| **field** | **type** | **required** | **description** |
+|-------------|:--------:|:------------:|---------------------------------------------------------|
+| type | string | true | Must be "cloud-storage-list-objects". |
+| source | string | true | Name of the Cloud Storage source to list objects from. |
+| description | string | true | Description of the tool that is passed to the LLM. |

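
The `prefix` and `delimiter` semantics documented above can be illustrated with a small in-memory emulation. This is only a sketch of the listing behavior, not the tool's implementation: `listObjects` is a hypothetical helper, it works on plain object names rather than calling Cloud Storage, and it ignores pagination.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// listObjects emulates prefix/delimiter listing over a set of object names.
// Names that contain the delimiter after the prefix are rolled up into a
// "common prefix"; the rest are returned as leaf objects.
func listObjects(names []string, prefix, delimiter string) (objects, prefixes []string) {
	seen := map[string]bool{}
	for _, name := range names {
		if !strings.HasPrefix(name, prefix) {
			continue // filtered out by prefix
		}
		rest := name[len(prefix):]
		if delimiter != "" {
			if i := strings.Index(rest, delimiter); i >= 0 {
				// Keep everything up to and including the first delimiter.
				p := prefix + rest[:i+len(delimiter)]
				if !seen[p] {
					seen[p] = true
					prefixes = append(prefixes, p)
				}
				continue
			}
		}
		objects = append(objects, name)
	}
	sort.Strings(objects)
	sort.Strings(prefixes)
	return objects, prefixes
}

func main() {
	names := []string{
		"a.txt",
		"img/cat.png",
		"logs/2024/x.log",
		"logs/2024/y.log",
		"logs/readme.md",
	}
	objs, pfx := listObjects(names, "", "/")
	fmt.Println("top level:", objs, pfx)
	objs, pfx = listObjects(names, "logs/", "/")
	fmt.Println("under logs/:", objs, pfx)
}
```

Calling the function again with one of the returned prefixes as the new `prefix` walks one "directory" level deeper, which is how an agent would typically browse a bucket.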
Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
+---
+title: "cloud-storage-read-object"
+type: docs
+weight: 2
+description: >
+  A "cloud-storage-read-object" tool reads the content of a Cloud Storage object and returns it as a base64-encoded string, optionally constrained to a byte range.
+---
+
+## About
+
+A `cloud-storage-read-object` tool fetches the bytes of a single
+[Cloud Storage object][gcs-objects] and returns them base64-encoded so that
+arbitrary binary content can be round-tripped through JSON safely. For large
+objects, prefer the optional `range` parameter to read only the bytes you need.
+
+This tool is intended for small-to-medium textual or binary content an LLM can
+process directly. For bulk downloads of large files to the local filesystem,
+use `cloud-storage-download-object` (coming in a follow-up release).
+
+[gcs-objects]: https://cloud.google.com/storage/docs/objects
+
+## Compatible Sources
+
+{{< compatible-sources >}}
+
+## Parameters
+
+| **parameter** | **type** | **required** | **description** |
+|---------------|:--------:|:------------:|---------------------------------------------------------------------------------------------------------------------------------------------------|
+| bucket | string | true | Name of the Cloud Storage bucket containing the object. |
+| object | string | true | Full object name (path) within the bucket, e.g. `path/to/file.txt`. |
+| range | string | false | Optional HTTP byte range, e.g. `bytes=0-999` (first 1000 bytes), `bytes=-500` (last 500 bytes), or `bytes=500-` (from byte 500 to end). Empty reads the full object. |
+
+## Example
+
+```yaml
+kind: tool
+name: read_object
+type: cloud-storage-read-object
+source: my-gcs-source
+description: Use this tool to read the content of a Cloud Storage object.
+```
+
+## Reference
+
+| **field** | **type** | **required** | **description** |
+|-------------|:--------:|:------------:|---------------------------------------------------------|
+| type | string | true | Must be "cloud-storage-read-object". |
+| source | string | true | Name of the Cloud Storage source to read the object from. |
+| description | string | true | Description of the tool that is passed to the LLM. |
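
The three `range` forms in the parameter table can be turned into an offset and length with a small parser, sketched below. This is not the range parser shipped in the commit: `parseRange`, `applyRange`, and `readEncoded` are hypothetical names, the data is an in-memory byte slice rather than a Cloud Storage object, and out-of-bounds ranges are clamped here instead of being rejected as unsatisfiable. The `(offset, length)` convention (length `-1` for "to end", negative offset for "last N bytes") is assumed to match what a range reader in the Go storage client expects.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseRange converts an HTTP-style byte range into (offset, length).
// "" reads everything: (0, -1). "bytes=500-" gives (500, -1).
// "bytes=-500" gives (-500, -1). "bytes=0-999" gives (0, 1000).
func parseRange(s string) (offset, length int64, err error) {
	if s == "" {
		return 0, -1, nil
	}
	if !strings.HasPrefix(s, "bytes=") {
		return 0, 0, errors.New(`range must start with "bytes="`)
	}
	first, last, ok := strings.Cut(strings.TrimPrefix(s, "bytes="), "-")
	if !ok || (first == "" && last == "") {
		return 0, 0, errors.New("range must look like A-B, A-, or -N")
	}
	if first == "" { // suffix form: last N bytes
		n, perr := strconv.ParseInt(last, 10, 64)
		if perr != nil || n <= 0 {
			return 0, 0, errors.New("invalid suffix length")
		}
		return -n, -1, nil
	}
	start, perr := strconv.ParseInt(first, 10, 64)
	if perr != nil || start < 0 {
		return 0, 0, errors.New("invalid range start")
	}
	if last == "" { // open-ended form: from start to end
		return start, -1, nil
	}
	end, perr := strconv.ParseInt(last, 10, 64)
	if perr != nil || end < start {
		return 0, 0, errors.New("invalid range end")
	}
	return start, end - start + 1, nil // HTTP ranges are inclusive
}

// applyRange slices data according to (offset, length), clamping to bounds.
func applyRange(data []byte, offset, length int64) []byte {
	n := int64(len(data))
	if offset < 0 {
		offset += n
		if offset < 0 {
			offset = 0
		}
	}
	if offset > n {
		offset = n
	}
	end := n
	if length >= 0 && offset+length < n {
		end = offset + length
	}
	return data[offset:end]
}

// readEncoded returns the selected bytes as a base64 string.
func readEncoded(data []byte, rng string) (string, error) {
	offset, length, err := parseRange(rng)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(applyRange(data, offset, length)), nil
}

func main() {
	data := []byte("hello world")
	for _, r := range []string{"", "bytes=0-4", "bytes=-5", "bytes=6-"} {
		out, err := readEncoded(data, r)
		fmt.Printf("%q -> %s (err=%v)\n", r, out, err)
	}
}
```

Base64 keeps the payload safe inside JSON regardless of the object's content type; the caller decodes it to recover the original bytes.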

go.mod

Lines changed: 3 additions & 2 deletions
@@ -16,6 +16,7 @@ require (
 	cloud.google.com/go/logging v1.13.2
 	cloud.google.com/go/longrunning v0.9.0
 	cloud.google.com/go/spanner v1.89.0
+	cloud.google.com/go/storage v1.62.1
 	github.com/ClickHouse/clickhouse-go/v2 v2.44.0
 	github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/metric v0.55.0
 	github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace v1.31.0
@@ -96,7 +97,7 @@ require (
 	cloud.google.com/go/auth v0.19.0 // indirect
 	cloud.google.com/go/auth/oauth2adapt v0.2.8 // indirect
 	cloud.google.com/go/compute/metadata v0.9.0 // indirect
-	cloud.google.com/go/iam v1.6.0 // indirect
+	cloud.google.com/go/iam v1.7.0 // indirect
 	cloud.google.com/go/monitoring v1.24.3 // indirect
 	cloud.google.com/go/trace v1.11.7 // indirect
 	dario.cat/mergo v1.0.2 // indirect
@@ -184,7 +185,7 @@ require (
 	github.com/google/pprof v0.0.0-20260115054156-294ebfa9ad83 // indirect
 	github.com/google/s2a-go v0.1.9 // indirect
 	github.com/googleapis/enterprise-certificate-proxy v0.3.14 // indirect
-	github.com/googleapis/gax-go/v2 v2.20.0 // indirect
+	github.com/googleapis/gax-go/v2 v2.21.0 // indirect
 	github.com/gorilla/websocket v1.5.3 // indirect
 	github.com/grpc-ecosystem/grpc-gateway/v2 v2.28.0 // indirect
 	github.com/gsterjov/go-libsecret v0.0.0-20161001094733-a6f4afe4910c // indirect

go.sum

Lines changed: 8 additions & 6 deletions
@@ -29,8 +29,8 @@ cloud.google.com/go/firestore v1.21.0 h1:BhopUsx7kh6NFx77ccRsHhrtkbJUmDAxNY3uapW
 cloud.google.com/go/firestore v1.21.0/go.mod h1:1xH6HNcnkf/gGyR8udd6pFO4Z7GWJSwLKQMx/u6UrP4=
 cloud.google.com/go/geminidataanalytics v0.9.0 h1:JeuMqZp4PlD9i0TLv+c2oTZe+UEatmF6oRC582TwxD8=
 cloud.google.com/go/geminidataanalytics v0.9.0/go.mod h1:IUJTO6abmPj5LNZWUqohEEeHmTgLDlIBFwwRC6VJZgM=
-cloud.google.com/go/iam v1.6.0 h1:JiSIcEi38dWBKhB3BtfKCW+dMvCZJEhBA2BsaGJgoxs=
-cloud.google.com/go/iam v1.6.0/go.mod h1:ZS6zEy7QHmcNO18mjO2viYv/n+wOUkhJqGNkPPGueGU=
+cloud.google.com/go/iam v1.7.0 h1:JD3zh0C6LHl16aCn5Akff0+GELdp1+4hmh6ndoFLl8U=
+cloud.google.com/go/iam v1.7.0/go.mod h1:tetWZW1PD/m6vcuY2Zj/aU0eCHNPuxedbnbRTyKXvdY=
 cloud.google.com/go/logging v1.13.2 h1:qqlHCBvieJT9Cdq4QqYx1KPadCQ2noD4FK02eNqHAjA=
 cloud.google.com/go/logging v1.13.2/go.mod h1:zaybliM3yun1J8mU2dVQ1/qDzjbOqEijZCn6hSBtKak=
 cloud.google.com/go/longrunning v0.9.0 h1:0EzbDEGsAvOZNbqXopgniY0w0a1phvu5IdUFq8grmqY=
@@ -39,8 +39,8 @@ cloud.google.com/go/monitoring v1.24.3 h1:dde+gMNc0UhPZD1Azu6at2e79bfdztVDS5lvhO
 cloud.google.com/go/monitoring v1.24.3/go.mod h1:nYP6W0tm3N9H/bOw8am7t62YTzZY+zUeQ+Bi6+2eonI=
 cloud.google.com/go/spanner v1.89.0 h1:r3h5Z5RA8JRPf3HCvA6ujNhREIMhPY+MrDL9mkY8jS0=
 cloud.google.com/go/spanner v1.89.0/go.mod h1:okNuxnp1wdPaVoM5M28Al2irKZLkHhZ2Z+DW6/ZJWGw=
-cloud.google.com/go/storage v1.61.3 h1:VS//ZfBuPGDvakfD9xyPW1RGF1Vy3BWUoVZXgW1KMOg=
-cloud.google.com/go/storage v1.61.3/go.mod h1:JtqK8BBB7TWv0HVGHubtUdzYYrakOQIsMLffZ2Z/HWk=
+cloud.google.com/go/storage v1.62.1 h1:Os0G3XbUbjZumkpDUf2Y0rLoXJTCF1kU2kWUujKYXD8=
+cloud.google.com/go/storage v1.62.1/go.mod h1:cpYz/kRVZ+UQAF1uHeea10/9ewcRbxGoGNKsS9daSXA=
 cloud.google.com/go/trace v1.11.7 h1:kDNDX8JkaAG3R2nq1lIdkb7FCSi1rCmsEtKVsty7p+U=
 cloud.google.com/go/trace v1.11.7/go.mod h1:TNn9d5V3fQVf6s4SCveVMIBS2LJUqo73GACmq/Tky0s=
 dario.cat/mergo v1.0.2 h1:85+piFYR1tMbRrLcDwR18y4UKJ3aH1Tbzi24VRW1TK8=
@@ -364,8 +364,8 @@ github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
 github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
 github.com/googleapis/enterprise-certificate-proxy v0.3.14 h1:yh8ncqsbUY4shRD5dA6RlzjJaT4hi3kII+zYw8wmLb8=
 github.com/googleapis/enterprise-certificate-proxy v0.3.14/go.mod h1:vqVt9yG9480NtzREnTlmGSBmFrA+bzb0yl0TxoBQXOg=
-github.com/googleapis/gax-go/v2 v2.20.0 h1:NIKVuLhDlIV74muWlsMM4CcQZqN6JJ20Qcxd9YMuYcs=
-github.com/googleapis/gax-go/v2 v2.20.0/go.mod h1:But/NJU6TnZsrLai/xBAQLLz+Hc7fHZJt/hsCz3Fih4=
+github.com/googleapis/gax-go/v2 v2.21.0 h1:h45NjjzEO3faG9Lg/cFrBh2PgegVVgzqKzuZl/wMbiI=
+github.com/googleapis/gax-go/v2 v2.21.0/go.mod h1:But/NJU6TnZsrLai/xBAQLLz+Hc7fHZJt/hsCz3Fih4=
 github.com/gorilla/securecookie v1.1.1 h1:miw7JPhV+b/lAHSXz4qd/nN9jRiAFV5FwjeKyCS8BvQ=
 github.com/gorilla/securecookie v1.1.1/go.mod h1:ra0sb63/xPlUeL+yeDciTfxMRAA+MP+HVt/4epWDjd4=
 github.com/gorilla/sessions v1.2.1 h1:DHd3rPN5lE3Ts3D8rKkQ8x/0kqfeNmBAaiSi+o7FsgI=
@@ -656,6 +656,8 @@ go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.43.0 h1:88Y4s2C8oTui1LGM6bT
 go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.43.0/go.mod h1:Vl1/iaggsuRlrHf/hfPJPvVag77kKyvrLeD10kpMl+A=
 go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.43.0 h1:3iZJKlCZufyRzPzlQhUIWVmfltrXuGyfjREgGP3UUjc=
 go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.43.0/go.mod h1:/G+nUPfhq2e+qiXMGxMwumDrP5jtzU+mWN7/sjT2rak=
+go.opentelemetry.io/otel/exporters/stdout/stdoutmetric v1.43.0 h1:TC+BewnDpeiAmcscXbGMfxkO+mwYUwE/VySwvw88PfA=
+go.opentelemetry.io/otel/exporters/stdout/stdoutmetric v1.43.0/go.mod h1:J/ZyF4vfPwsSr9xJSPyQ4LqtcTPULFR64KwTikGLe+A=
 go.opentelemetry.io/otel/metric v1.43.0 h1:d7638QeInOnuwOONPp4JAOGfbCEpYb+K6DVWvdxGzgM=
 go.opentelemetry.io/otel/metric v1.43.0/go.mod h1:RDnPtIxvqlgO8GRW18W6Z/4P462ldprJtfxHxyKd2PY=
 go.opentelemetry.io/otel/sdk v1.43.0 h1:pi5mE86i5rTeLXqoF/hhiBtUNcrAGHLKQdhg4h4V9Dg=
