Teradata
diff --git a/agentic/.claude-plugin/plugin.json b/agentic/.claude-plugin/plugin.json
Lines changed: 5 additions & 0 deletions
diff --git a/agentic/skills/teradata-mcp-customisation/SKILL.md b/agentic/skills/teradata-mcp-customisation/SKILL.md
Lines changed: 69 additions & 0 deletions
diff --git a/agentic/skills/teradata-mcp-customisation/examples/example_cube.yml b/agentic/skills/teradata-mcp-customisation/examples/example_cube.yml
Lines changed: 95 additions & 0 deletions
diff --git a/agentic/skills/teradata-mcp-customisation/examples/example_glossary.yml b/agentic/skills/teradata-mcp-customisation/examples/example_glossary.yml
Lines changed: 55 additions & 0 deletions
diff --git a/agentic/skills/teradata-mcp-customisation/examples/example_profiles.yml b/agentic/skills/teradata-mcp-customisation/examples/example_profiles.yml
Lines changed: 46 additions & 0 deletions
diff --git a/agentic/skills/teradata-mcp-customisation/examples/example_prompt.yml b/agentic/skills/teradata-mcp-customisation/examples/example_prompt.yml
Lines changed: 45 additions & 0 deletions
diff --git a/agentic/skills/teradata-mcp-customisation/examples/example_tool.yml b/agentic/skills/teradata-mcp-customisation/examples/example_tool.yml
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,5 @@
+{
+  "name": "teradata-mcp-customisation",
+  "description": "Build, edit, and debug a semantic layer for the Teradata MCP server — custom tools, cubes, prompts, and glossary entries declared in YAML, plus the profiles.yml that exposes them.",
+  "version": "1.0.0"
+}
@@ -0,0 +1,69 @@
+---
+name: teradata-mcp-customisation
+description: Use when the task is to build, edit, or debug a semantic layer for the Teradata MCP server (https://github.com/Teradata/teradata-mcp-server) — custom tools, cubes, prompts, and glossary entries declared in YAML, plus the profiles.yml that exposes them. Covers schema rules, the cube SQL wrapping model, parameter substitution, profile regex selection, and the runtime smoke-test flow.
+---
+
+You are extending the Teradata MCP server with a **semantic layer**: domain-specific tools, cubes, prompts, and glossary terms declared in YAML files that the server picks up at startup.
+
+The server is the upstream project at https://github.com/Teradata/teradata-mcp-server. The customisation surface is documented in `docs/server_guide/CUSTOMIZING.md`; this skill compiles the parts that are easy to get wrong, plus working examples that anyone can run against the `DBC` system database.
+
+## When to use this skill
+
+- Designing a new semantic layer for a Teradata data product (cubes + curated prompts).
+- Adding or editing a custom tool / cube / prompt / glossary entry.
+- Building a `profiles.yml` to expose a subset of objects to a given client.
+- Debugging why a custom object does not appear in `tools/list`, `prompts/list`, or `resources/list`.
+- Migrating a hand-written SQL pattern into a reusable cube the LLM can compose.
+
+## Mental model — what each object becomes at runtime
+
+| YAML `type:` | Where it surfaces | Selected by profile section |
+|---|---|---|
+| `tool` | MCP **tool** — parameterised SQL the LLM calls | `tool:` |
+| `cube` | MCP **tool** — the server auto-generates a 6-arg aggregator (dimensions, measures, dim_filters, meas_filters, order_by, top) plus any custom params you declare | `tool:` |
+| `prompt` | MCP **prompt** — a reusable system / user prompt the client can fetch by name | `prompt:` |
+| `glossary` | MCP **resource** — domain terms surfaced as context resources, enriched with cube measure/dim descriptions | `resource:` |
+
+There is no `type: resource`. Resources are derived: glossary entries become resources, and cube/tool descriptions feed back into glossary enrichment.
+
+## Workflow
+
+1. **Locate the config directory.** The server reads YAML from `--config_dir` (CLI) or `$CONFIG_DIR` (env), defaulting to the current working directory. Drop one or more `*.yml` files there alongside `profiles.yml`. Multiple files merge into one namespace keyed by object name.
+2. **Pick the object type** for what you are building. If in doubt, see `reference/object-types.md`.
+3. **Write the SQL first, in isolation.** Run it in your usual Teradata client until it returns what you want — *then* wrap it as a tool or cube. For cubes, write the flat denormalised base SQL; let the server build the aggregator on top. See `reference/cube-mechanics.md` for exactly how the wrapping works (which is critical for understanding dim_filters vs meas_filters semantics).
+4. **Get parameter substitution right.** Custom tools support two styles: `:param` for value binds, `{param}` for identifier interpolation (database / table names). Cubes also accept custom parameters used inside the base SQL the same way. See `reference/parameter-substitution.md`.
+5. **Expose via `profiles.yml`.** Add a profile (or extend an existing one) with regex selectors. See `reference/profiles.md` for the layering rules and the built-in profiles you can inherit from.
+6. **Smoke-test the server** with the MCP listing flow before pointing a client at it. See `reference/deployment.md` for the bash curl recipe (`initialize` → `notifications/initialized` → `tools/list`).
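+
+A minimal sketch of step 4, assuming the same field names as `examples/example_tool.yml` (`type`, `description`, `parameters`, `type_hint`, `default`, `sql`). The tool name `demo_sample_rows` and its parameters are hypothetical, and `reference/parameter-substitution.md` remains the authority on exact behaviour:
+
+```yaml
+# Hypothetical tool, for illustration only.
+demo_sample_rows:
+  type: tool
+  description: "Return up to N rows of one table. Use for a quick look at column contents, not for counts."
+  parameters:
+    database_name:
+      description: "Database that holds the table."
+      type_hint: str
+      default: "DBC"
+    table_name:
+      description: "Table or view to sample."
+      type_hint: str
+      default: "DatabasesV"
+    n:
+      description: "Maximum rows to return (default 5)."
+      type_hint: int
+      default: 5
+  sql: |
+    SELECT TOP :n *
+    FROM {database_name}.{table_name}
+```
+
+Here `{database_name}` and `{table_name}` are spliced in as identifiers, while `:n` travels as a bound value.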
+
+## Authoring rules (important — don't skip)
+
+1. **Descriptions are the contract.** The `description` fields on tools, cubes, dimensions, measures, and parameters are what the LLM reads to choose and shape its calls. Write them like terse API docs: what it represents, units, when to use it, when *not* to. The server appends type info automatically — don't repeat it.
+2. **Cubes are filtered twice.** `dim_filters` apply to the flat base SQL before aggregation (use raw column names from the base SELECT). `meas_filters` apply after `GROUP BY` (use measure names — `nii > 1000` not `SUM(nii_v) > 1000`). Get this wrong and the LLM will write filters that error or return nothing.
+3. **Pin the domain in the base SQL `WHERE`.** Hard-code the filters that define what the cube *is* (one product family, one subject area). Do not leave that decision to `dim_filters` — the LLM will forget or invent.
+4. **Keep the menu small.** Aim for ~10 dimensions and ~15 measures per cube. More than that and selection accuracy drops sharply. Split into two cubes if you have a wider surface.
+5. **Aggregate measure expressions are first-class.** A measure `expression:` is any Teradata SQL that evaluates to a scalar over the group: `SUM(...)`, ratios with `NULLIFZERO`, even `SUM(SUM(x)) OVER ()` for "share of total" measures (the window function has to wrap the group aggregate). The server inlines it as `expression AS measure_name`.
+6. **Never embed literal credentials** in cube/tool SQL or in `profiles.yml`. Connection strings belong in the `run:` block of `profiles.yml`, with the secrets supplied through env-var substitution (`${TD_USER}`, `${TD_PASSWORD}`), or in the server's own startup env.
+7. **Name with a domain prefix.** `sales_growth_cube`, `dba_space_cube`. A single regex (`sales_.*`) then selects the whole pack into a profile.
+8. **Prompts are not auto-attached.** A `type: prompt` is an MCP prompt the client *fetches by name*; it is not silently injected into every conversation. Say so in your prompt's `description`.
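+
+Rules 2, 3, and 7 can be sketched together. Everything named below (`fin_nii_cube`, the `fin` tables and their columns) is hypothetical; only the `nii` / `nii_v` pairing comes from rule 2, and the field layout follows `examples/example_cube.yml`:
+
+```yaml
+# Hypothetical cube: table, column, and object names are illustrative.
+# The base WHERE pins the domain (rule 3), so callers never filter on it.
+fin_nii_cube:
+  type: cube
+  description: "NII by branch for the mortgage product family only."
+  sql: |
+    SELECT
+      b.branch_name  AS branch_name,
+      f.nii_amount   AS nii_v
+    FROM fin.nii_fact f
+    JOIN fin.branch b ON b.branch_id = f.branch_id
+    WHERE f.product_family = 'Mortgage'
+  dimensions:
+    branch_name:
+      description: "Branch display name."
+      expression: branch_name
+  measures:
+    nii:
+      description: "NII summed over the selected dimensions."
+      expression: "SUM(nii_v)"
+
+# Call arguments an LLM might send (rule 2):
+#   dim_filters  = "branch_name = 'Leeds'"   # base-SQL column, before GROUP BY
+#   meas_filters = "nii > 1000"              # measure name, after GROUP BY
+```
+
+The `fin_` prefix means a single selector such as `^fin_.*` could pull this cube and its siblings into a profile.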
+
+## Progressive references
+
+Load these only when needed:
+
+- `reference/object-types.md` — full YAML schemas for `tool`, `cube`, `prompt`, `glossary` with every field and what each does.
+- `reference/cube-mechanics.md` — the exact SELECT the server generates around your base SQL, plus the auto-added cube parameters (`dimensions`, `measures`, `dim_filters`, `meas_filters`, `order_by`, `top`).
+- `reference/parameter-substitution.md` — `:name` value binds vs `{name}` identifier formatting, the synthetic `{table_ref}` key, supported `type_hint` values, and the auto-added `persist` parameter on custom tools.
+- `reference/profiles.md` — how `profiles.yml` layers over the packaged profiles, regex selector rules, the `run:` block (transport / port / database URI), and the built-in profiles you can extend (`all`, `eda`, `dba`, `dataScientist`, …).
+- `reference/deployment.md` — `config_dir` resolution, server launch, and the bare-bones curl smoke-test that walks the streamable-HTTP MCP handshake to list tools / prompts / resources.
+
+## Worked examples (DBC-only, portable)
+
+All examples target `DBC` system views so anyone with Teradata access can run them without setting up sample data:
+
+- `examples/example_tool.yml` — a parameterised tool that lists the N largest tables in a given database (`DBC.AllSpaceV`).
+- `examples/example_cube.yml` — a "database space" cube on `DBC.AllSpaceV` with dimensions (database, owner, account) and measures (current, peak, and max perm; perm utilisation; peak spool; share of total perm).
+- `examples/example_prompt.yml` — a Teradata DBA persona prompt with a `{focus_area}` parameter.
+- `examples/example_glossary.yml` — glossary entries for `permspace`, `spool`, `skew_factor`, and `account_string`.
+- `examples/example_profiles.yml` — a `dbc_demo` profile that selects the four objects above by regex, plus a `dbc_demo_super` profile that adds read-only base tools and data-quality (`qlty_`) tools.
+
+Copy any of these into a `*.yml` file in your config dir, restart the server, and they will appear in the relevant MCP list.
@@ -0,0 +1,95 @@
+# ============================================================================
+# Example cube — "database space" semantic layer over DBC.AllSpaceV
+# ============================================================================
+#
+# Demonstrates:
+#   - flat denormalised base SQL with public-named aliases for dimensions
+#     and _v-suffixed aliases for measure source columns
+#   - domain pinned in base WHERE (TableName = 'All' for db-level rows,
+#     exclude system schemas the user can't manage)
+#   - aggregate measure expressions, ratio with NULLIFZERO, share-of-total
+#     via SUM(...) OVER ()
+#   - no custom parameters: the auto-added dim_filters / meas_filters /
+#     order_by / top are enough for most ad-hoc questions
+# ============================================================================
+
+dbc_space_cube:
+  type: cube
+  description: |
+    Database-level storage cube. One row per (database, owner, account) with
+    measures derived from DBC.AllSpaceV's database-grain rows (TableName =
+    'All'). Use this for "who is using the space", "which databases are
+    near their cap", "share of total perm" questions. For per-table
+    breakdown within one database, use dbc_top_tables_by_size instead.
+
+    Built-in scope: excludes the DBC, TDWM, and SystemFe administrative
+    schemas, which would otherwise dominate every result.
+  sql: |
+    SELECT
+      s.DataBaseName       AS database_name,
+      d.OwnerName          AS owner_name,
+      d.AccountName        AS account_name,
+      s.CurrentPerm        AS current_perm_v,
+      s.PeakPerm           AS peak_perm_v,
+      s.MaxPerm            AS max_perm_v,
+      s.CurrentSpool       AS current_spool_v,
+      s.PeakSpool          AS peak_spool_v,
+      s.MaxSpool           AS max_spool_v
+    FROM DBC.AllSpaceV s
+    JOIN DBC.DatabasesV d ON d.DatabaseName = s.DataBaseName
+    WHERE s.TableName = 'All'
+      AND s.DataBaseName NOT IN ('DBC', 'TDWM', 'SystemFe')
+  dimensions:
+    database_name:
+      description: "Database / schema name (DBC.AllSpaceV.DataBaseName)."
+      expression: database_name
+    owner_name:
+      description: "Owner of the database (DBC.DatabasesV.OwnerName), distinct from its creator (CreatorName)."
+      expression: owner_name
+    account_name:
+      description: "Account string used for workload accounting."
+      expression: account_name
+  measures:
+    current_perm_bytes:
+      description: "Current permanent space in use across all AMPs, bytes."
+      expression: "SUM(current_perm_v)"
+    current_perm_gb:
+      description: "Current permanent space in use, GiB."
+      expression: "CAST(SUM(current_perm_v) / 1073741824.0 AS DECIMAL(18,2))"
+    peak_perm_bytes:
+      description: "Peak permanent space observed since last reset, bytes."
+      expression: "SUM(peak_perm_v)"
+    max_perm_bytes:
+      description: "Maximum permanent space granted to the database, bytes."
+      expression: "SUM(max_perm_v)"
+    perm_utilisation_pct:
+      description: |
+        Current perm as % of granted max. NULL when max=0 (databases
+        with no perm allocation). Above ~85% indicates a database that
+        should be reviewed.
+      expression: "CAST(SUM(current_perm_v) * 100.0 / NULLIFZERO(SUM(max_perm_v)) AS DECIMAL(10,2))"
+    spool_peak_bytes:
+      description: "Peak spool consumed by users charged to this database, bytes."
+      expression: "SUM(peak_spool_v)"
+    share_of_total_perm_pct:
+      description: |
+        This group's share of total CurrentPerm across the result set, %.
+        Useful for ranking which database, owner, or account is using
+        most of the space in scope.
+      expression: "CAST(SUM(current_perm_v) * 100.0 / NULLIFZERO(SUM(SUM(current_perm_v)) OVER ()) AS DECIMAL(10,2))"
+
+# ----------------------------------------------------------------------------
+# Example calls (what the LLM will generate against the auto-built tool):
+#
+#   dimensions    = "database_name"
+#   measures      = "current_perm_gb, perm_utilisation_pct, share_of_total_perm_pct"
+#   dim_filters   = "owner_name = 'DBADMIN'"
+#   meas_filters  = "perm_utilisation_pct > 80"
+#   order_by      = "current_perm_gb DESC"
+#   top           = 20
+#
+# Note:
+#   - dim_filters references owner_name (a column emitted by base SQL).
+#   - meas_filters references perm_utilisation_pct (a measure ALIAS), not
+#     its SUM() expression — see reference/cube-mechanics.md §2.
+# ----------------------------------------------------------------------------
@@ -0,0 +1,55 @@
+# ============================================================================
+# Example glossary — Teradata storage / workload terms
+# ============================================================================
+#
+# Demonstrates:
+#   - one glossary object defining many terms
+#   - synonyms list (optional)
+#   - multi-line definitions via YAML block scalar
+#
+# Reminder: glossary objects are selected by the profile's `resource:`
+# pattern list, NOT `tool:`. They surface as MCP resources, not tools.
+# ============================================================================
+
+dbc_glossary:
+  type: glossary
+
+  permspace:
+    definition: |
+      Permanent table storage granted to a database. Measured in bytes per
+      AMP, summed across AMPs in DBC.AllSpaceV. A database's CurrentPerm
+      grows as users insert into its tables; PeakPerm tracks the high-water
+      mark since the counters were last reset.
+    synonyms:
+      - perm
+      - permanent space
+      - currentperm
+
+  spool:
+    definition: |
+      Temporary intermediate storage used by Teradata query execution. Each
+      user / account has a spool quota; running out aborts the query with
+      a 2646 error. Spool is charged to the requesting user, not the
+      database where the data lives.
+    synonyms:
+      - spool space
+      - sortspool
+
+  skew_factor:
+    definition: |
+      Data distribution skew across AMPs, computed as
+      (max_amp_size - avg_amp_size) / max_amp_size * 100. Above ~30%
+      usually indicates a poor primary-index choice that should be
+      reviewed; above ~60% can cause out-of-spool errors on joins.
+    synonyms:
+      - skew
+      - amp skew
+
+  account_string:
+    definition: |
+      A short identifier (DBC.DatabasesV.AccountName) used to group
+      sessions for workload management and accounting. Multiple users can
+      share an account string; charge-back reports aggregate by account.
+    synonyms:
+      - account
+      - account name
@@ -0,0 +1,46 @@
+# ============================================================================
+# Example profiles.yml — exposes the dbc_* example objects, plus a "super"
+# profile that adds read-only base tools and qlty for free-form exploration.
+# ============================================================================
+#
+# Demonstrates:
+#   - one profile per audience, named after the persona / use case
+#   - regex selectors with anchored prefix matches
+#   - reuse of the upstream "eda" tool pattern (no writeQuery / dynamicQuery)
+#   - run: block to bind a profile to a specific transport / port
+#   - env-var substitution in the database URI so credentials stay out of
+#     this file
+# ============================================================================
+
+# Curated business profile — only the DBC semantic layer. Suitable for an
+# end-user chat where you want the LLM to call cubes, not write SQL.
+dbc_demo:
+  tool:
+    - ^dbc_top_tables_by_size$
+    - ^dbc_space_cube$
+  prompt:
+    - ^dbc_dba_prompt$
+  resource:
+    - ^dbc_glossary$
+  run:
+    database_uri: "teradata://${TD_USER}:${TD_PASSWORD}@${TD_HOST}:1025"
+    mcp_transport: "streamable-http"
+    mcp_port: 8010
+
+# Super profile — semantic layer + read-only base tools + data-quality
+# tools. Suitable for analysts who want to free-form explore alongside the
+# curated cubes.
+dbc_demo_super:
+  tool:
+    - ^dbc_.*
+    - "^base_(?!(writeQuery|dynamicQuery)$).*"
+    - ^qlty_.*
+    - ^sec_userDbPermissions$
+  prompt:
+    - ^dbc_.*
+  resource:
+    - ^dbc_.*
+  run:
+    database_uri: "teradata://${TD_USER}:${TD_PASSWORD}@${TD_HOST}:1025"
+    mcp_transport: "streamable-http"
+    mcp_port: 8011
@@ -0,0 +1,45 @@
+# ============================================================================
+# Example prompt — a DBA persona with a focus_area parameter
+# ============================================================================
+#
+# Demonstrates:
+#   - parameterised prompt with {placeholder} substitution
+#   - required parameter via `required: true` (prompt-specific field)
+#   - description that explicitly tells callers what tools the prompt is
+#     designed to be used alongside
+# ============================================================================
+
+dbc_dba_prompt:
+  type: prompt
+  description: |
+    Teradata DBA persona, scoped to a chosen problem area. Designed to be
+    used with the dbc_top_tables_by_size tool and the dbc_space_cube. The
+    client must fetch this prompt via prompts/get with the focus_area
+    argument set before sending it as the system message.
+  parameters:
+    focus_area:
+      description: |
+        One of 'space', 'workload', or 'security' — controls which kind
+        of question the assistant will treat as in-scope.
+      type_hint: str
+      required: true
+  prompt: |
+    You are a Teradata DBA assistant. Your scope is **{focus_area}**.
+
+    Behaviour:
+    - For in-scope questions, plan before each tool call (one short line),
+      then call the appropriate tool. Reflect on the result before
+      answering.
+    - If the user asks something outside {focus_area}, redirect politely
+      and offer to re-scope.
+    - All sizes in your final answer should be reported in GiB rounded to
+      two decimals. All percentages to one decimal.
+    - Never expose internal column names with the _v suffix or raw SQL.
+
+    Available data:
+    - dbc_space_cube — database / owner / account aggregations of
+      DBC.AllSpaceV. Use this for ranking and "share of total" questions.
+    - dbc_top_tables_by_size — per-table breakdown within one database.
+      Use this when the user names a specific database.
+
+    Glossary terms you may rely on: permspace, spool, skew_factor.
@@ -0,0 +1,41 @@
+# ============================================================================
+# Example custom tool — uses DBC system views so it works on any Teradata
+# system without setup. Lists the N largest tables in a given database by
+# current permanent space.
+# ============================================================================
+#
+# Demonstrates:
+#   - text interpolation via {database_name} (here inside a quoted literal)
+#   - value bind via :n
+#   - parameter defaults (both args optional)
+#   - description fields that surface to the LLM
+# ============================================================================
+
+dbc_top_tables_by_size:
+  type: tool
+  description: |
+    List the top-N largest tables in a Teradata database, ranked by current
+    permanent space (bytes). Reads DBC.AllSpaceV. Use this for quick "what
+    is filling up this schema" questions; for database-level rankings and
+    share-of-total questions use the dbc_space_cube instead.
+  parameters:
+    database_name:
+      description: "Database / schema to inspect. Default DBC (system catalog)."
+      type_hint: str
+      default: "DBC"
+    n:
+      description: "How many tables to return (default 10)."
+      type_hint: int
+      default: 10
+  sql: |
+    SELECT TOP :n
+      TableName                                AS table_name,
+      SUM(CurrentPerm)                         AS current_perm_bytes,
+      SUM(PeakPerm)                            AS peak_perm_bytes,
+      CAST(SUM(CurrentPerm) / 1024.0 / 1024.0
+           AS DECIMAL(18,2))                   AS current_perm_mb
+    FROM DBC.AllSpaceV
+    WHERE DataBaseName = '{database_name}'
+      AND TableName   <> 'All'
+    GROUP BY TableName
+    ORDER BY current_perm_bytes DESC
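+
+# ----------------------------------------------------------------------------
+# Variant sketch (an assumption, not tested against the server): because
+# database_name is compared as a plain value above, the quoted
+# '{database_name}' interpolation could be swapped for a value bind, the
+# style SKILL.md describes for values:
+#
+#     WHERE DataBaseName = :database_name
+#       AND TableName   <> 'All'
+#
+# The {param} form would then be kept for true identifiers, such as a
+# database or table name in the FROM clause.
+# ----------------------------------------------------------------------------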