Commit e43e2aa

caohy1988 and claude committed

feat: add 1P BigQuery skill for guided data analysis workflows

Pre-packaged BigQuery data analysis skill following the agentskills.io specification. Users combine BigQueryToolset (raw tools) with SkillToolset (curated guidance) for progressive disclosure of workflow instructions and reference materials.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

1 parent 991abd4

File tree: 11 files changed, +1122 −1 lines changed
# Design: First-Party (1P) Skills for ADK Toolsets

## Problem

ADK toolsets like `BigQueryToolset` provide raw tools (e.g., `execute_sql`,
`list_dataset_ids`) but no guidance on how to use them effectively. Developers
must reinvent prompt engineering for each toolset, embedding workflow
knowledge directly in agent instructions. This leads to:

- Duplicated effort across agent builders.
- Inconsistent quality of analysis workflows.
- No standard way to share toolset expertise.
- Agent instructions that grow unwieldy as guidance accumulates.
## Solution

Pre-packaged skills that follow the
[agentskills.io specification](https://agentskills.io/specification),
consumed via ADK's existing `SkillToolset`. Zero new APIs, zero new classes.

A 1P skill is simply a spec-compliant skill directory that ships with ADK
alongside its corresponding toolset. Users add both the toolset (for tools)
and a `SkillToolset` (for guided workflows) to their agent.

```python
# Before: raw toolset, no guidance
root_agent = LlmAgent(tools=[bigquery_toolset])

# After: toolset + 1P skill for guided workflows
bq_skill_toolset = SkillToolset(skills=[get_bigquery_skill()])
root_agent = LlmAgent(tools=[bigquery_toolset, bq_skill_toolset])
```
## How It Works

### Progressive Disclosure

The skill content is loaded in three levels, keeping context efficient:

1. **L1 - Metadata** (always in context): Skill name and description are
   returned by `list_skills`. The LLM sees what skills are available without
   loading full instructions.

2. **L2 - Instructions** (loaded on activation): When the LLM calls
   `load_skill(name="bigquery-data-analysis")`, it receives the SKILL.md
   body with step-by-step workflow guidance.

3. **L3 - References** (loaded on demand): When the LLM needs detailed
   patterns, it calls `load_skill_resource` to load specific reference
   files (e.g., `sql_patterns.md`, `error_handling.md`).
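The three levels can be sketched as plain functions over a dict. This is a hypothetical simplification of the tools `SkillToolset` exposes, not ADK's implementation, and the skill content below is invented for illustration:

```python
# Hypothetical sketch of the three disclosure levels. NOT ADK's
# implementation; the skill content here is invented for illustration.
SKILL = {
    "name": "bigquery-data-analysis",
    "description": "Guided workflows for BigQuery analysis.",
    "instructions": (
        "1. Explore schemas with list_dataset_ids.\n"
        "2. Preview rows with LIMIT.\n"
        "3. Run the full query with execute_sql."
    ),
    "references": {
        "sql_patterns.md": "Prefer CTEs over deeply nested subqueries.",
    },
}


def list_skills():
    # L1: metadata only -- cheap enough to keep in context at all times.
    return [{"name": SKILL["name"], "description": SKILL["description"]}]


def load_skill(name):
    # L2: full instructions, loaded only when the skill is activated.
    if name != SKILL["name"]:
        raise KeyError(name)
    return SKILL["instructions"]


def load_skill_resource(name, path):
    # L3: one reference file at a time, loaded on demand.
    if name != SKILL["name"]:
        raise KeyError(name)
    return SKILL["references"][path]
```

Each function reveals strictly more detail than the previous one, which is the whole point: the model pays the context cost of L2 and L3 only when it decides it needs them.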
### Runtime Flow

```
1. Agent starts -> SkillToolset injects skill system instruction
2. User asks question -> LLM sees list_skills tool available
3. LLM calls list_skills -> sees "bigquery-data-analysis" skill
4. LLM calls load_skill("bigquery-data-analysis") -> gets workflow steps
5. LLM follows steps, using BigQuery tools (execute_sql, etc.)
6. LLM calls load_skill_resource for detailed patterns as needed
```
### Directory Structure

```
src/google/adk/tools/bigquery/
├── bigquery_toolset.py              # Existing: raw tools
├── bigquery_skill.py                # New: get_bigquery_skill() loader
└── skills/
    └── bigquery-data-analysis/      # Spec-compliant skill directory
        ├── SKILL.md                 # Frontmatter + workflow instructions
        └── references/
            ├── sql_patterns.md
            ├── schema_exploration.md
            └── error_handling.md
```
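A minimal SKILL.md for this layout might look like the following sketch; the frontmatter values and instruction text are illustrative, not the shipped skill's actual content:

```markdown
---
name: bigquery-data-analysis
description: Guided workflows for exploring and analyzing data in BigQuery.
license: Apache-2.0
metadata:
  version: "0.1.0"
---

# BigQuery Data Analysis

1. Explore available datasets with `list_dataset_ids` and `list_table_ids`.
2. Inspect schemas with `get_table_info` before writing SQL.
3. Keep exploratory queries cheap with `LIMIT`; see
   `references/sql_patterns.md` for detailed query patterns.
```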
## API Usage

### Before (tools only)

```python
from google.adk.agents.llm_agent import LlmAgent
from google.adk.tools.bigquery.bigquery_toolset import BigQueryToolset

bigquery_toolset = BigQueryToolset(credentials_config=creds)

root_agent = LlmAgent(
    model="gemini-2.5-flash",
    name="analyst",
    instruction="""You are a data analyst. When analyzing data:
1. First explore schemas with list_dataset_ids, list_table_ids...
2. Use get_table_info before writing queries...
3. Always use LIMIT on exploratory queries...
4. Use CTEs for complex queries...
5. Handle errors by checking get_job_info...
... (many lines of hand-written guidance)""",
    tools=[bigquery_toolset],
)
```

### After (tools + 1P skill)

```python
from google.adk.agents.llm_agent import LlmAgent
from google.adk.tools.bigquery.bigquery_toolset import BigQueryToolset
from google.adk.tools.bigquery.bigquery_skill import get_bigquery_skill
from google.adk.tools.skill_toolset import SkillToolset

bigquery_toolset = BigQueryToolset(credentials_config=creds)
bq_skill_toolset = SkillToolset(skills=[get_bigquery_skill()])

root_agent = LlmAgent(
    model="gemini-2.5-flash",
    name="analyst",
    instruction="You are a data analyst. Use your tools and skills.",
    tools=[bigquery_toolset, bq_skill_toolset],
)
```

The curated guidance moves from fragile inline instructions into a
structured, versioned, spec-compliant skill that the agent discovers
and loads at runtime.
### Composability

`BigQueryToolset` and `SkillToolset` are fully independent — neither
depends on nor references the other. The 1P skill is opt-in; nothing
auto-includes it. This means all of the following patterns work:

```python
# BigQuery toolset + your own custom skills (no 1P BQ skill)
my_skill = load_skill_from_dir("path/to/my-custom-skill")
root_agent = LlmAgent(
    tools=[
        BigQueryToolset(credentials_config=creds),
        SkillToolset(skills=[my_skill]),
    ],
)
```

```python
# BigQuery toolset + 1P BQ skill + your own skills (all together)
root_agent = LlmAgent(
    tools=[
        BigQueryToolset(credentials_config=creds),
        SkillToolset(skills=[get_bigquery_skill(), my_skill]),
    ],
)
```

```python
# BigQuery toolset alone, no skills at all
root_agent = LlmAgent(
    tools=[BigQueryToolset(credentials_config=creds)],
)
```

Users choose exactly which skills to include. The `get_bigquery_skill()`
loader is a convenience, not a coupling.
## Why This Design Is Minimal

This design achieves guided workflows with the absolute minimum change
to the existing API surface:

1. **No behavioral changes** to `BigQueryToolset`, `SkillToolset`,
   `LlmAgent`, or the runner flow.
2. **No signature changes** or breaking changes to existing public APIs.
3. **Entirely additive**: a packaged skill directory + a thin loader +
   sample + tests.
4. **Opt-in**: existing user patterns work unchanged; the new pattern
   is `tools=[bigquery_toolset, skill_toolset]`.

### Trade-off: Minimalism vs. Ergonomics

For minimum API churn, this is the right design. A more ergonomic
single-line UX (e.g., `BigQueryToolset(include_skill=True)`) would
require new convenience APIs, increasing surface area and review risk.
The current two-line pattern keeps the toolset and skill concerns
cleanly separated.
### Public Surface Note

`get_bigquery_skill` is exported from `google.adk.tools.bigquery` for
discoverability alongside `BigQueryToolset`. This is still additive and
acceptable. For an absolute-minimum public surface, it could instead be
kept as an import from `google.adk.tools.bigquery.bigquery_skill` only.
## Repeatable Template for New Toolsets

The pattern scales cleanly to Spanner, Bigtable, PubSub, and other
toolsets without changing existing core APIs. Follow these steps per
toolset:

1. Add a spec-compliant skill directory under
   `src/google/adk/tools/<toolset>/skills/<skill-name>/`.
2. Add a thin loader `get_<toolset>_skill()` that calls
   `load_skill_from_dir(...)`.
3. (Optional but recommended) Export the loader in
   `src/google/adk/tools/<toolset>/__init__.py`.
4. Add tests for skill validity + `SkillToolset` integration.
5. Add a sample showing
   `tools=[<Toolset>(...), SkillToolset(skills=[get_<toolset>_skill()])]`.
### 1. Create a Spec-Compliant Skill Directory

```
src/google/adk/tools/<toolset>/skills/<skill-name>/
├── SKILL.md          # Required: YAML frontmatter + instructions
└── references/       # Optional: detailed reference materials
    └── ...
```

The directory name must match the `name` field in SKILL.md frontmatter.
### 2. Add a Convenience Loader

```python
# src/google/adk/tools/<toolset>/<toolset>_skill.py

import pathlib

from google.adk.skills import Skill, load_skill_from_dir

_SKILL_DIR = pathlib.Path(__file__).parent / "skills" / "<skill-name>"


def get_<toolset>_skill() -> Skill:
    return load_skill_from_dir(_SKILL_DIR)
```
### 3. Users Combine Toolset + SkillToolset

```python
from google.adk.tools.<toolset> import <Toolset>
from google.adk.tools.<toolset>.<toolset>_skill import get_<toolset>_skill
from google.adk.tools.skill_toolset import SkillToolset

toolset = <Toolset>(...)
skill_toolset = SkillToolset(skills=[get_<toolset>_skill()])
agent = LlmAgent(tools=[toolset, skill_toolset])
```
### Candidate Toolsets

- **Spanner**: Schema design, transaction patterns, query optimization.
- **Bigtable**: Row key design, filter patterns, scan optimization.
- **PubSub**: Topic/subscription setup, message handling, dead-letter queues.
## Spec Compliance

The skill directory maps to [agentskills.io](https://agentskills.io/specification)
fields as follows:

| Spec Field | Source |
|------------|--------|
| `name` | SKILL.md frontmatter `name` (must match directory name) |
| `description` | SKILL.md frontmatter `description` |
| `license` | SKILL.md frontmatter `license` |
| `metadata` | SKILL.md frontmatter `metadata` |
| `instructions` | SKILL.md body (after frontmatter) |
| `references` | `references/` directory (loaded by `load_skill_resource`) |
| `assets` | `assets/` directory (not used by this skill) |
| `scripts` | `scripts/` directory (not used by this skill) |

ADK's `load_skill_from_dir()` validates name-directory match, parses YAML
frontmatter, and loads all resource directories. `SkillToolset` provides
the standard tools for skill discovery, loading, and resource access.
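The validation described above can be approximated in a few lines. This is a hedged sketch, not ADK's actual `load_skill_from_dir`; a real loader would use a full YAML parser and also collect `assets/` and `scripts/`:

```python
# Hedged sketch of the checks load_skill_from_dir performs, per the
# mapping above. Not ADK's source: real YAML parsing and asset/script
# loading are omitted.
import pathlib


def parse_frontmatter(text):
    """Split a '---'-delimited frontmatter block from the markdown body."""
    _, raw_meta, body = text.split("---", 2)
    meta = {}
    for line in raw_meta.strip().splitlines():
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, body.strip()


def load_skill_dict(skill_dir):
    skill_dir = pathlib.Path(skill_dir)
    meta, body = parse_frontmatter((skill_dir / "SKILL.md").read_text())
    # Spec rule: frontmatter `name` must match the directory name.
    if meta.get("name") != skill_dir.name:
        raise ValueError(
            f"skill name {meta.get('name')!r} != directory {skill_dir.name!r}"
        )
    refs_dir = skill_dir / "references"
    references = (
        {p.name: p.read_text() for p in refs_dir.iterdir()}
        if refs_dir.is_dir()
        else {}
    )
    return {"name": meta["name"], "instructions": body, "references": references}
```

A mismatched `name` fails fast with `ValueError`, which is what makes the name-directory rule enforceable at load time rather than at first tool call.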
# 1P BigQuery Skill Sample

This sample demonstrates the **1P (first-party) Skills** pattern: combining
a raw toolset (`BigQueryToolset`) with a curated skill (`SkillToolset`) to
give the agent both tools and guided workflows.

## What This Shows

- **BigQueryToolset** provides raw tools: `execute_sql`, `list_dataset_ids`,
  `get_table_info`, etc.
- **SkillToolset** provides skill discovery and loading: `list_skills`,
  `load_skill`, `load_skill_resource`, `run_skill_script`.
- The **bigquery-data-analysis** skill ships pre-packaged with ADK and
  follows the [agentskills.io specification](https://agentskills.io/specification).

The agent can discover the skill at runtime, load its instructions for
guided workflows, and access reference materials on demand.
## Setup

1. Install ADK with BigQuery extras:

   ```bash
   pip install google-adk[bigquery]
   ```

2. Set up OAuth credentials:

   - Create OAuth 2.0 credentials in the Google Cloud Console.
   - Update `agent.py` with your `client_id` and `client_secret`.

3. Run the sample:

   ```bash
   adk web contributing/samples
   ```

4. Select `1p_bigquery_skill` from the agent list.
## How It Works

```
User Query
    |
    v
LlmAgent (gemini-2.5-flash)
    |
    +-- BigQueryToolset tools (direct data access)
    |     list_dataset_ids, list_table_ids, get_table_info,
    |     execute_sql, forecast, detect_anomalies, ...
    |
    +-- SkillToolset tools (guided workflows)
          list_skills         -> discovers "bigquery-data-analysis"
          load_skill          -> loads step-by-step instructions
          load_skill_resource -> loads sql_patterns.md, etc.
          run_skill_script    -> executes skill scripts
```
## Progressive Disclosure

The skill uses three levels of content:

1. **L1 - Metadata** (always available): skill name and description shown
   via `list_skills`.
2. **L2 - Instructions** (on activation): full workflow steps loaded via
   `load_skill`.
3. **L3 - References** (on demand): detailed SQL patterns, schema
   exploration guides, and error handling loaded via `load_skill_resource`.

This keeps the agent's context efficient while making deep knowledge
available when needed.
## Extending This Pattern

Other toolsets can follow the same pattern:

1. Create a spec-compliant skill directory under `tools/<toolset>/skills/`.
2. Add a `get_*_skill()` convenience loader.
3. Users add both the toolset and `SkillToolset` to their agent's tools.
# Copyright 2026 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from . import agent
