Commit 86e5c52

Add teradata-mcp-customisation skill to agentic/skills/ (#328)
1 parent 430f5cd commit 86e5c52

12 files changed

Lines changed: 1011 additions & 0 deletions

agentic/.claude-plugin/plugin.json

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
{
  "name": "teradata-mcp-customisation",
  "description": "Build, edit, and debug a semantic layer for the Teradata MCP server: custom tools, cubes, prompts, and glossary entries declared in YAML, plus the profiles.yml that exposes them.",
  "version": "1.0.0"
}
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
---
name: teradata-mcp-customisation
description: Use when the task is to build, edit, or debug a semantic layer for the Teradata MCP server (https://github.com/Teradata/teradata-mcp-server), covering custom tools, cubes, prompts, and glossary entries declared in YAML, plus the profiles.yml that exposes them. Covers schema rules, the cube SQL wrapping model, parameter substitution, profile regex selection, and the runtime smoke-test flow.
---

You are extending the Teradata MCP server with a **semantic layer**: domain-specific tools, cubes, prompts, and glossary terms declared in YAML files that the server picks up at startup.

The server is the upstream project at https://github.com/Teradata/teradata-mcp-server. The customisation surface is documented in `docs/server_guide/CUSTOMIZING.md`; this skill compiles the parts that are easy to get wrong, plus working examples that anyone can run against the `DBC` system database.

## When to use this skill

- Designing a new semantic layer for a Teradata data product (cubes + curated prompts).
- Adding or editing a custom tool / cube / prompt / glossary entry.
- Building a `profiles.yml` to expose a subset of objects to a given client.
- Debugging why a custom object does not appear in `tools/list`, `prompts/list`, or `resources/list`.
- Migrating a hand-written SQL pattern into a reusable cube the LLM can compose.

## Mental model: what each object becomes at runtime

| YAML `type:` | Where it surfaces | Selected by profile section |
|---|---|---|
| `tool` | MCP **tool**: parameterised SQL the LLM calls | `tool:` |
| `cube` | MCP **tool**: the server auto-generates a 6-arg aggregator (dimensions, measures, dim_filters, meas_filters, order_by, top) plus any custom params you declare | `tool:` |
| `prompt` | MCP **prompt**: a reusable system / user prompt the client can fetch by name | `prompt:` |
| `glossary` | MCP **resource**: domain terms surfaced as context resources, enriched with cube measure/dim descriptions | `resource:` |

There is no `type: resource`. Resources are derived: glossary entries become resources, and cube/tool descriptions feed back into glossary enrichment.
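
The table above can be restated as a small lookup, which is handy when debugging why an object is missing from a listing. This is only an illustrative sketch: `SURFACES` and `profile_section_for` are hypothetical names, not part of the server.

```python
# Mapping copied from the table above:
# YAML `type:` -> (MCP surface, profiles.yml section that selects it).
SURFACES = {
    "tool": ("tool", "tool"),
    "cube": ("tool", "tool"),
    "prompt": ("prompt", "prompt"),
    "glossary": ("resource", "resource"),
}


def profile_section_for(object_type: str) -> str:
    """Return the profiles.yml section that selects objects of this type."""
    try:
        return SURFACES[object_type][1]
    except KeyError:
        # e.g. "resource": there is no such YAML type, resources are derived.
        raise ValueError(f"unsupported type: {object_type!r}") from None


if __name__ == "__main__":
    for yaml_type in SURFACES:
        print(yaml_type, "->", profile_section_for(yaml_type))
```

A cube that is listed only under `prompt:` or `resource:` in a profile would therefore not be expected to show up in `tools/list`.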

## Workflow

1. **Locate the config directory.** The server reads YAML from `--config_dir` (CLI) or `$CONFIG_DIR` (env), defaulting to the current working directory. Drop one or more `*.yml` files there alongside `profiles.yml`. Multiple files merge into one namespace keyed by object name.
2. **Pick the object type** for what you are building. If in doubt, see `reference/object-types.md`.
3. **Write the SQL first, in isolation.** Run it in your usual Teradata client until it returns what you want, and only *then* wrap it as a tool or cube. For cubes, write the flat denormalised base SQL; let the server build the aggregator on top. See `reference/cube-mechanics.md` for exactly how the wrapping works (which is critical for understanding dim_filters vs meas_filters semantics).
4. **Get parameter substitution right.** Custom tools support two styles: `:param` for value binds, `{param}` for identifier interpolation (database / table names). Cubes also accept custom parameters used inside the base SQL the same way. See `reference/parameter-substitution.md`.
5. **Expose via `profiles.yml`.** Add a profile (or extend an existing one) with regex selectors. See `reference/profiles.md` for the layering rules and the built-in profiles you can inherit from.
6. **Smoke-test the server** with the MCP listing flow before pointing a client at it. See `reference/deployment.md` for the bash curl recipe (`initialize`, then `notifications/initialized`, then `tools/list`).
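
The three-message listing flow from step 6 can be sketched as the JSON-RPC bodies a client would POST in order. This is a minimal sketch, not the recipe from `reference/deployment.md`: the helper name `handshake_messages`, the client name, and the `2025-03-26` protocol revision are assumptions, and the HTTP transport details (endpoint path, session header) are deliberately left out.

```python
import json


def handshake_messages(client_name: str = "smoke-test") -> list[dict]:
    """Build the JSON-RPC messages for initialize -> initialized -> tools/list."""
    initialize = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumed protocol revision; the server may negotiate another one.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": "0.0.1"},
        },
    }
    # A notification carries no "id" and expects no response.
    initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}
    tools_list = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
    return [initialize, initialized, tools_list]


if __name__ == "__main__":
    for message in handshake_messages():
        print(json.dumps(message))
```

Swapping the last method for `prompts/list` or `resources/list` gives the other two listings mentioned above.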

## Authoring rules (important, don't skip)

1. **Descriptions are the contract.** The `description` fields on tools, cubes, dimensions, measures, and parameters are what the LLM reads to choose and shape its calls. Write them like terse API docs: what it represents, units, when to use it, when *not* to. The server appends type info automatically; don't repeat it.
2. **Cubes are filtered twice.** `dim_filters` apply to the flat base SQL before aggregation (use raw column names from the base SELECT). `meas_filters` apply after `GROUP BY` (use measure names: `nii > 1000`, not `SUM(nii_v) > 1000`). Get this wrong and the LLM will write filters that error or return nothing.
3. **Pin the domain in the base SQL `WHERE`.** Hard-code the filters that define what the cube *is* (one product family, one subject area). Do not leave that decision to `dim_filters`; the LLM may forget the filter or invent one.
4. **Keep the menu small.** Aim for ~10 dimensions and ~15 measures per cube. Beyond that, the LLM's selection accuracy tends to drop. Split into two cubes if you have a wider surface.
5. **Aggregate measure expressions are first-class.** A measure `expression:` is whatever Teradata SQL evaluates to a scalar over the group: `SUM(...)`, ratios with `NULLIFZERO`, even `SUM(x) OVER ()` for "share of total" measures. The server inlines it as `expression AS measure_name`.
6. **Never hard-code credentials** in cube/tool SQL or in `profiles.yml`. Connection settings belong in the `run:` block of `profiles.yml`, with credentials supplied via env-var substitution (`${TD_USER}`, `${TD_PASSWORD}`), or in the server's own startup env.
7. **Name with a domain prefix.** `sales_growth_cube`, `dba_space_cube`. A single regex (`sales_.*`) then selects the whole pack into a profile.
8. **Prompts are not auto-attached.** A `type: prompt` is an object the client *fetches by name*; it is not silently injected into every conversation. Document that in your prompt's `description`.
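
Rule 2 is easier to remember with a picture of where each filter lands. The sketch below is a hypothetical illustration only: `build_cube_query` is not the server's code, and the exact SELECT the server generates is documented in `reference/cube-mechanics.md`. It assumes the pre-aggregation filter sits in a `WHERE` over the base SQL and the post-aggregation filter sits outside the grouped query.

```python
def build_cube_query(base_sql, dimensions, measures,
                     dim_filters="", meas_filters="", order_by="", top=None):
    """Illustrative wrapper: `measures` maps measure name -> aggregate expression."""
    select_list = list(dimensions) + [
        f"{expression} AS {name}" for name, expression in measures.items()
    ]
    inner = f"SELECT {', '.join(select_list)}\nFROM (\n{base_sql}\n) AS base"
    if dim_filters:
        # Applied to raw base columns, before aggregation.
        inner += f"\nWHERE {dim_filters}"
    if dimensions:
        inner += f"\nGROUP BY {', '.join(dimensions)}"

    top_clause = f"TOP {int(top)} " if top is not None else ""
    outer = f"SELECT {top_clause}*\nFROM (\n{inner}\n) AS agg"
    if meas_filters:
        # Applied to measure names, after aggregation.
        outer += f"\nWHERE {meas_filters}"
    if order_by:
        outer += f"\nORDER BY {order_by}"
    return outer


if __name__ == "__main__":
    print(build_cube_query(
        "SELECT OwnerName AS owner_name, CurrentPerm AS current_perm_v FROM t",
        ["owner_name"],
        {"current_perm_bytes": "SUM(current_perm_v)"},
        dim_filters="owner_name = 'DBADMIN'",
        meas_filters="current_perm_bytes > 1000",
        order_by="current_perm_bytes DESC",
        top=20,
    ))
```

Under these assumptions, a filter such as `SUM(current_perm_v) > 1000` in `meas_filters` would fail because the outer query only sees the measure name, which matches the guidance in rule 2.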

## Progressive references

Load these only when needed:

- `reference/object-types.md`: full YAML schemas for `tool`, `cube`, `prompt`, `glossary` with every field and what each does.
- `reference/cube-mechanics.md`: the exact SELECT the server generates around your base SQL, plus the auto-added cube parameters (`dimensions`, `measures`, `dim_filters`, `meas_filters`, `order_by`, `top`).
- `reference/parameter-substitution.md`: `:name` value binds vs `{name}` identifier formatting, the synthetic `{table_ref}` key, supported `type_hint` values, and the auto-added `persist` parameter on custom tools.
- `reference/profiles.md`: how `profiles.yml` layers over the packaged profiles, regex selector rules, the `run:` block (transport / port / database URI), and the built-in profiles you can extend (`all`, `eda`, `dba`, `dataScientist`, …).
- `reference/deployment.md`: `config_dir` resolution, server launch, and the bare-bones curl smoke-test that walks the streamable-HTTP MCP handshake to list tools / prompts / resources.

## Worked examples (DBC-only, portable)

All examples target `DBC` system views so anyone with Teradata access can run them without setting up sample data:

- `examples/example_tool.yml`: a parameterised tool that lists the N largest tables in a given database (`DBC.AllSpaceV`).
- `examples/example_cube.yml`: a "database space" cube on `DBC.AllSpaceV` with dimensions (database, owner, account) and measures (current perm, peak perm, max perm, perm utilisation, peak spool, share of total perm).
- `examples/example_prompt.yml`: a Teradata DBA persona prompt with a `{focus_area}` parameter.
- `examples/example_glossary.yml`: glossary entries for `permspace`, `spool`, `skew_factor`, and `account_string`.
- `examples/example_profiles.yml`: a `dbc_demo` profile that selects the four objects above by regex, plus a `dbc_demo_super` profile that adds read-only base tools.

Copy any of these into a `*.yml` file in your config dir, restart the server, and they will appear in the relevant MCP list.
Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
# ============================================================================
# Example cube: "database space" semantic layer over DBC.AllSpaceV
# ============================================================================
#
# Demonstrates:
#   - flat denormalised base SQL with public-named alias for dimensions and
#     _v-suffixed alias for measure source columns
#   - domain pinned in base WHERE (TableName = 'All' for db-level rows,
#     exclude system schemas the user can't manage)
#   - aggregate measure expressions, ratio with NULLIFZERO, share-of-total
#     via SUM(...) OVER ()
#   - no custom parameters: the auto-added dim_filters / meas_filters /
#     order_by / top are enough for most ad-hoc questions
# ============================================================================

dbc_space_cube:
  type: cube
  description: |
    Database-level storage cube. One row per (database, owner, account) with
    measures derived from DBC.AllSpaceV's database-grain rows (TableName =
    'All'). Use this for "who is using the space", "which databases are
    near their cap", "share of total perm" questions. For per-table
    breakdown within one database, use dbc_top_tables_by_size instead.

    Built-in scope: excludes the DBC, TDWM, and SystemFe administrative
    schemas, which would otherwise dominate every result.
  sql: |
    SELECT
      s.DataBaseName AS database_name,
      d.OwnerName    AS owner_name,
      d.AccountName  AS account_name,
      s.CurrentPerm  AS current_perm_v,
      s.PeakPerm     AS peak_perm_v,
      s.MaxPerm      AS max_perm_v,
      s.CurrentSpool AS current_spool_v,
      s.PeakSpool    AS peak_spool_v,
      s.MaxSpool     AS max_spool_v
    FROM DBC.AllSpaceV s
    JOIN DBC.DatabasesV d ON d.DatabaseName = s.DataBaseName
    WHERE s.TableName = 'All'
      AND s.DataBaseName NOT IN ('DBC', 'TDWM', 'SystemFe')
  dimensions:
    database_name:
      description: "Database / schema name (DBC.AllSpaceV.DataBaseName)."
      expression: database_name
    owner_name:
      description: "Database owner / creator (DBC.DatabasesV.OwnerName)."
      expression: owner_name
    account_name:
      description: "Account string used for workload accounting."
      expression: account_name
  measures:
    current_perm_bytes:
      description: "Current permanent space in use across all AMPs, bytes."
      expression: "SUM(current_perm_v)"
    current_perm_gb:
      description: "Current permanent space in use, GiB."
      expression: "CAST(SUM(current_perm_v) / 1073741824.0 AS DECIMAL(18,2))"
    peak_perm_bytes:
      description: "Peak permanent space observed since last reset, bytes."
      expression: "SUM(peak_perm_v)"
    max_perm_bytes:
      description: "Maximum permanent space granted to the database, bytes."
      expression: "SUM(max_perm_v)"
    perm_utilisation_pct:
      description: |
        Current perm as % of granted max. NULL when max=0 (uncapped
        databases). Above ~85% indicates a database that should be
        reviewed.
      expression: "CAST(SUM(current_perm_v) * 100.0 / NULLIFZERO(SUM(max_perm_v)) AS DECIMAL(10,2))"
    spool_peak_bytes:
      description: "Peak spool consumed by users charged to this database, bytes."
      expression: "SUM(peak_spool_v)"
    share_of_total_perm_pct:
      description: |
        This group's share of total CurrentPerm across the result set, %.
        Useful for ranking which database, owner, or account is using
        most of the space in scope.
      expression: "CAST(SUM(current_perm_v) * 100.0 / NULLIFZERO(SUM(SUM(current_perm_v)) OVER ()) AS DECIMAL(10,2))"

# ----------------------------------------------------------------------------
# Example calls (what the LLM will generate against the auto-built tool):
#
#   dimensions   = "database_name"
#   measures     = "current_perm_gb, perm_utilisation_pct, share_of_total_perm_pct"
#   dim_filters  = "owner_name = 'DBADMIN'"
#   meas_filters = "perm_utilisation_pct > 80"
#   order_by     = "current_perm_gb DESC"
#   top          = 20
#
# Note:
#   - dim_filters references owner_name (a column emitted by base SQL).
#   - meas_filters references perm_utilisation_pct (a measure ALIAS), not
#     its SUM() expression; see reference/cube-mechanics.md §2.
# ----------------------------------------------------------------------------
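
To make the two ratio measures in this cube concrete, the following sketch reproduces their arithmetic outside the database. It assumes `NULLIFZERO` maps a zero denominator to NULL (modelled as `None`) and approximates the `DECIMAL(10,2)` cast with `round`; the helper names and sample numbers are illustrative, not taken from the commit.

```python
def nullifzero(value):
    """Mimic NULLIFZERO: a zero becomes NULL (None), anything else passes through."""
    return None if value == 0 else value


def perm_utilisation_pct(current_perm, max_perm):
    """SUM(current) * 100 / NULLIFZERO(SUM(max)) over one group of rows."""
    denominator = nullifzero(sum(max_perm))
    if denominator is None:
        return None  # uncapped database: the measure is NULL, not an error
    return round(sum(current_perm) * 100.0 / denominator, 2)


def share_of_total_pct(group_totals):
    """Each group's SUM divided by the total over all groups (SUM(SUM(x)) OVER ())."""
    grand_total = nullifzero(sum(group_totals.values()))
    return {
        group: None if grand_total is None else round(total * 100.0 / grand_total, 2)
        for group, total in group_totals.items()
    }


if __name__ == "__main__":
    print(perm_utilisation_pct([40, 10], [100, 100]))
    print(share_of_total_pct({"sales": 30, "finance": 10}))
```

Because the share is computed over the groups in the result, changing `dim_filters` changes the denominator as well as the numerator.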
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
# ============================================================================
# Example glossary: Teradata storage / workload terms
# ============================================================================
#
# Demonstrates:
#   - one glossary object defining many terms
#   - synonyms list (optional)
#   - multi-line definitions via YAML block scalar
#
# Reminder: glossary objects are selected by the profile's `resource:`
# pattern list, NOT `tool:`. They surface as MCP resources, not tools.
# ============================================================================

dbc_glossary:
  type: glossary

  permspace:
    definition: |
      Permanent table storage granted to a database. Measured in bytes per
      AMP, summed across AMPs in DBC.AllSpaceV. A database's CurrentPerm
      grows as users insert into its tables; PeakPerm tracks the high-water
      mark since the counters were last reset.
    synonyms:
      - perm
      - permanent space
      - currentperm

  spool:
    definition: |
      Temporary intermediate storage used by Teradata query execution. Each
      user / account has a spool quota; running out aborts the query with
      a 2646 error. Spool is charged to the requesting user, not the
      database where the data lives.
    synonyms:
      - spool space
      - sortspool

  skew_factor:
    definition: |
      Data distribution skew across AMPs, computed as
      (max_amp_size - avg_amp_size) / max_amp_size * 100. Above ~30%
      usually indicates a poor primary-index choice that should be
      reviewed; above ~60% will cause out-of-spool errors on joins.
    synonyms:
      - skew
      - amp skew

  account_string:
    definition: |
      A short identifier (DBC.DatabasesV.AccountName) used to group
      sessions for workload management and accounting. Multiple users can
      share an account string; charge-back reports aggregate by account.
    synonyms:
      - account
      - account name
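
The skew factor formula in the glossary can be checked with a few per-AMP sizes. This is a minimal sketch of the stated formula only; the function name and sample values are illustrative.

```python
def skew_factor_pct(amp_sizes):
    """(max_amp_size - avg_amp_size) / max_amp_size * 100, as defined above."""
    if not amp_sizes:
        raise ValueError("need at least one AMP size")
    largest = max(amp_sizes)
    if largest == 0:
        return 0.0  # an empty table has no skew to report
    average = sum(amp_sizes) / len(amp_sizes)
    return (largest - average) / largest * 100


if __name__ == "__main__":
    # Three AMPs hold 100 units each and one holds 300: average 150, max 300.
    print(skew_factor_pct([100, 100, 100, 300]))
```

With those sample sizes the hot AMP carries twice the average load, which puts the table above the ~30% review threshold mentioned in the definition.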
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
# ============================================================================
# Example profiles.yml: exposes the dbc_* example objects, plus a "super"
# profile that adds read-only base tools and qlty tools for free-form
# exploration.
# ============================================================================
#
# Demonstrates:
#   - one profile per audience, named after the persona / use case
#   - regex selectors with anchored prefix matches
#   - reuse of the upstream "eda" tool pattern (no writeQuery / dynamicQuery)
#   - run: block to bind a profile to a specific transport / port
#   - env-var substitution in the database URI so credentials stay out of
#     this file
# ============================================================================

# Curated business profile: only the DBC semantic layer. Suitable for an
# end-user chat where you want the LLM to call cubes, not write SQL.
dbc_demo:
  tool:
    - ^dbc_top_tables_by_size$
    - ^dbc_space_cube$
  prompt:
    - ^dbc_dba_prompt$
  resource:
    - ^dbc_glossary$
  run:
    database_uri: "teradata://${TD_USER}:${TD_PASSWORD}@${TD_HOST}:1025"
    mcp_transport: "streamable-http"
    mcp_port: 8010

# Super profile: semantic layer + read-only base tools + data-quality
# tools. Suitable for analysts who want to free-form explore alongside the
# curated cubes.
dbc_demo_super:
  tool:
    - ^dbc_.*
    - "base_(?!(writeQuery|dynamicQuery)$).*"
    - qlty_.*
    - ^sec_userDbPermissions$
  prompt:
    - ^dbc_.*
  resource:
    - ^dbc_.*
  run:
    database_uri: "teradata://${TD_USER}:${TD_PASSWORD}@${TD_HOST}:1025"
    mcp_transport: "streamable-http"
    mcp_port: 8011
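
The `tool:` selectors of `dbc_demo_super` can be tried against a list of names before starting the server. This is a sketch under stated assumptions: it matches each pattern from the start of the name with `re.match`, which may differ from how the server applies patterns (see `reference/profiles.md`), and the names `base_readQuery`, `qlty_missingValues`, and `sec_rolePermissions` are placeholders for whatever the server actually registers.

```python
import re

# Patterns copied from the dbc_demo_super profile above.
TOOL_PATTERNS = [
    r"^dbc_.*",
    r"base_(?!(writeQuery|dynamicQuery)$).*",
    r"qlty_.*",
    r"^sec_userDbPermissions$",
]

# Illustrative tool names; only the dbc_* and writeQuery / dynamicQuery names
# are taken from the examples, the rest are assumptions.
CANDIDATE_TOOLS = [
    "dbc_space_cube",
    "dbc_top_tables_by_size",
    "base_readQuery",
    "base_writeQuery",
    "base_dynamicQuery",
    "qlty_missingValues",
    "sec_userDbPermissions",
    "sec_rolePermissions",
]


def select(patterns, names):
    """Keep the names matched (from their first character) by any pattern."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [name for name in names if any(c.match(name) for c in compiled)]


if __name__ == "__main__":
    for name in select(TOOL_PATTERNS, CANDIDATE_TOOLS):
        print(name)
```

The negative lookahead is what keeps `base_writeQuery` and `base_dynamicQuery` out while still admitting every other `base_` tool.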
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
# ============================================================================
# Example prompt: a DBA persona with a focus_area parameter
# ============================================================================
#
# Demonstrates:
#   - parameterised prompt with {placeholder} substitution
#   - required parameter via `required: true` (prompt-specific field)
#   - description that explicitly tells callers what tools the prompt is
#     designed to be used alongside
# ============================================================================

dbc_dba_prompt:
  type: prompt
  description: |
    Teradata DBA persona, scoped to a chosen problem area. Designed to be
    used with the dbc_top_tables_by_size tool and the dbc_space_cube. The
    client must fetch this prompt via prompts/get with the focus_area
    argument set before sending it as the system message.
  parameters:
    focus_area:
      description: |
        One of 'space', 'workload', or 'security'; controls which kind
        of question the assistant will treat as in-scope.
      type_hint: str
      required: true
  prompt: |
    You are a Teradata DBA assistant. Your scope is **{focus_area}**.

    Behaviour:
    - For in-scope questions, plan before each tool call (one short line),
      then call the appropriate tool. Reflect on the result before
      answering.
    - If the user asks something outside {focus_area}, redirect politely
      and offer to re-scope.
    - All sizes in your final answer should be reported in GiB rounded to
      two decimals. All percentages to one decimal.
    - Never expose internal column names with the _v suffix or raw SQL.

    Available data:
    - dbc_space_cube: database / owner / account aggregations of
      DBC.AllSpaceV. Use this for ranking and "share of total" questions.
    - dbc_top_tables_by_size: per-table breakdown within one database.
      Use this when the user names a specific database.

    Glossary terms you may rely on: permspace, spool, skew_factor.
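
The `{focus_area}` placeholder behaves like ordinary string formatting, which the sketch below imitates for a shortened copy of the prompt. It is an assumption-laden illustration: `render_prompt` is a hypothetical helper, and the check against the three documented values is added here for clarity; the server may not validate the argument this way.

```python
ALLOWED_FOCUS_AREAS = {"space", "workload", "security"}

# Shortened copy of the prompt body above, for illustration only.
TEMPLATE = (
    "You are a Teradata DBA assistant. Your scope is **{focus_area}**.\n"
    "If the user asks something outside {focus_area}, redirect politely."
)


def render_prompt(template, **arguments):
    """Fill the {placeholder} fields after checking the required argument."""
    focus_area = arguments.get("focus_area")
    if focus_area is None:
        raise ValueError("focus_area is required")
    if focus_area not in ALLOWED_FOCUS_AREAS:
        raise ValueError(f"focus_area must be one of {sorted(ALLOWED_FOCUS_AREAS)}")
    return template.format(**arguments)


if __name__ == "__main__":
    print(render_prompt(TEMPLATE, focus_area="space"))
```

Every occurrence of the placeholder is filled with the same value, so the scope statement and the redirect rule stay consistent.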
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
# ============================================================================
# Example custom tool: uses DBC system views so it works on any Teradata
# system without setup. Lists the N largest tables in a given database by
# current permanent space.
# ============================================================================
#
# Demonstrates:
#   - identifier interpolation via {database_name}
#   - value bind via :n
#   - parameter defaults (both args optional)
#   - description fields that surface to the LLM
# ============================================================================

dbc_top_tables_by_size:
  type: tool
  description: |
    List the top-N largest tables in a Teradata database, ranked by current
    permanent space (bytes). Reads DBC.AllSpaceV. Use this for quick "what
    is filling up this schema" questions; for trend / peak analysis use the
    dbc_space_cube instead.
  parameters:
    database_name:
      description: "Database / schema to inspect. Default DBC (system catalog)."
      type_hint: str
      default: "DBC"
    n:
      description: "How many tables to return (default 10)."
      type_hint: int
      default: 10
  sql: |
    SELECT TOP :n
      TableName AS table_name,
      SUM(CurrentPerm) AS current_perm_bytes,
      SUM(PeakPerm) AS peak_perm_bytes,
      CAST(SUM(CurrentPerm) / 1024.0 / 1024.0
           AS DECIMAL(18,2)) AS current_perm_mb
    FROM DBC.AllSpaceV
    WHERE DataBaseName = '{database_name}'
      AND TableName <> 'All'
    GROUP BY TableName
    ORDER BY current_perm_bytes DESC
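
The two substitution styles used by this tool can be told apart mechanically: `{name}` is pasted into the SQL text, while `:name` stays in the text and travels as a bound value. The sketch below is a hypothetical illustration of that split, not the server's implementation; `prepare`, the identifier whitelist, and the shortened SQL are assumptions. It also rejects interpolated values that are not plain identifiers, since pasted text is where injection risk lives.

```python
import re
from string import Formatter

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# A colon followed by a name, not preceded by another colon or a word character.
BIND = re.compile(r"(?<![:\w]):([A-Za-z_]\w*)")


def prepare(sql, params):
    """Return (sql_text, binds): {name} is interpolated, :name is left as a bind."""
    placeholders = {field for _, field, _, _ in Formatter().parse(sql) if field}
    for name in placeholders:
        if not IDENTIFIER.fullmatch(str(params[name])):
            raise ValueError(f"{name} is not a plain identifier: {params[name]!r}")
    text = sql.format(**{name: params[name] for name in placeholders})
    binds = {name: params[name] for name in BIND.findall(text)}
    return text, binds


if __name__ == "__main__":
    sql = (
        "SELECT TOP :n TableName FROM DBC.AllSpaceV "
        "WHERE DataBaseName = '{database_name}'"
    )
    text, binds = prepare(sql, {"database_name": "DBC", "n": 10})
    print(text)
    print(binds)
```

Values that come from the LLM or the end user are generally safer as `:name` binds; `{name}` is best reserved for database and table names that cannot be bound.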
