feat(semantic-layer-builder): add meta-skill for interface-to-semantic-layer modeling

AgenticWeb4 · cursoragent · AgenticWeb4 · commit 059ca89bd1ad · 2026-06-29T11:41:01.000+08:00
Co-authored-by: Cursor <cursoragent@cursor.com>
diff --git a/README.md b/README.md
@@ -67,6 +67,7 @@ SemanticSkills/
 | --- | --- | --- | --- |
 | `huawei-cloud-billing-scout` | 2.3.8 | **Huawei Cloud Read-Only Billing — Spend, Charges & Reconciliation** — one-page BSS briefing via KooCLI | [details](docs/skills/huawei-cloud-billing-scout.md) · [changelog](qa/huawei-cloud-billing-scout/CHANGELOG.md) |
 | `huawei-cloud-cost-estimation` | 1.0.0 | **Huawei Cloud Pre-Order Cost Estimation** — period and on-demand quotes via hcloud BSS | [details](docs/skills/huawei-cloud-cost-estimation.md) · [changelog](qa/huawei-cloud-cost-estimation/CHANGELOG.md) |
+| `semantic-layer-builder` | 0.1.0 | **Semantic Layer Builder** — meta-skill: interface → governed Kimball semantic layer + OKF export | [details](docs/skills/semantic-layer-builder.md) · [changelog](qa/semantic-layer-builder/CHANGELOG.md) |
 
 Index: [docs/catalog.yml](docs/catalog.yml).
 
diff --git a/docs/catalog.yml b/docs/catalog.yml
@@ -30,3 +30,17 @@ skills:
     compatibility: hcloud KooCLI 7.2+, BSS IAM (bss:order:view permission), outbound network; no agent auto-install
     marketplaces: [skills-sh, skillsmp]
     distribution: direct-skill
+  - id: semantic-layer-builder
+    path: skills/semantic-layer-builder
+    qa: qa/semantic-layer-builder
+    display_name: "Semantic Layer Builder — Interface to Governed Semantic Layer (Meta-Skill)"
+    display_name_zh: "语义层构建器 · 接口转语义元技能"
+    domain: meta
+    subdomains: [semantic-layer, dimensional-modeling, ontology, okf]
+    summary: Meta-skill—turns a REST/OpenAPI, CLI, or table/DDL interface into a governed Kimball semantic layer via a guided one-fact-at-a-time interview (facts/grain/dimensions/measures/routing); emits semantic objects as repo YAML or markdown and a Google OKF v0.1 bundle; evidence-only, refuses to invent fields or values
+    summary_zh: 元技能 · 把 REST/CLI/数据表接口逐项确认建成 Kimball 语义层（事实/粒度/维度/度量/路由）；输出仓库 YAML 或 markdown，并可导出 Google OKF v0.1；只依据接口证据，不臆造
+    version: "0.1.0"
+    agents: [cursor, claude-code, codex, openclaw]
+    compatibility: No external tools or network; pure modeling + file generation; optional YAML/markdown linter for validation
+    marketplaces: [skills-sh, skillsmp]
+    distribution: direct-skill
diff --git a/docs/skills/semantic-layer-builder.md b/docs/skills/semantic-layer-builder.md
@@ -0,0 +1,87 @@
+# Semantic Layer Builder (Meta-Skill)
+
+`semantic-layer-builder` · **Semantic Layer Builder — Interface to Governed Semantic Layer (Meta-Skill)**
+
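+Turns an **interface contract** (REST/OpenAPI, CLI help, or table/DDL) into a governed Kimball semantic layer through a guided interview with **item-by-item confirmation**: lock facts and grain first, then attach dimensions and measures, and finally generate semantic objects according to the schema. Output is this repo's YAML or markdown, with optional export to **Google OKF v0.1** (recommended). The only source of truth is the interface the user provides: it does **not invent** fields, grain, enums, or values.
+
+> **Meta-skill** · It produces semantic layers for other domains and does not itself connect to any cloud or database.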
+
+**Version:** 0.1.0 · Changelog: [qa/semantic-layer-builder/CHANGELOG.md](../../qa/semantic-layer-builder/CHANGELOG.md)
+
+## What it does
+
+| Capability | Typical questions |
+| --- | --- |
+| Ingest interface | Normalize REST/OpenAPI, CLI help, or table/DDL into an operation inventory |
+| Confirm facts & grain | "One row = what?" Lock a falsifiable grain |
+| Model dimensions | conformed / snowflake / scope / degenerate / abstract / encoded_constant |
+| Model measures | Additivity (additive / semi / non-additive) + calculation definition |
+| Routing & boundary | entry_points and evidence_boundary (what cannot be answered) |
+| Emit | This repo's Kimball YAML / markdown / **Google OKF v0.1 (recommended)** |
+| Out of scope | Does not invent fields/enums/values the interface does not provide; write operations are only framed, not modeled without confirmation |
+
+## In-skill flow
+
+```text
+Interface (REST / CLI / table)
+     │
+     ▼
+Phase 1 · Ingest ── operation inventory (name/method/safety/inputs/outputs/doc)
+     │
+     ▼
+Phase 1 · Frame ── fact | dimension-lookup | scope (routed by fact·dimension·measure·time·scope)
+     │
+     ▼
+Phase 2 · Confirm ── one question at a time, echoing a small table at each step: Facts&Grain → Dimensions → Measures → Selection/Pairing → Routing/Boundary
+     │
+     ▼
+Phase 3 · Emit ── generate per schema-spec; export OKF via okf-emitter; run Conformance
+```
+
+## SKILL.md structure
+
+| Section | Purpose |
+| --- | --- |
+| Workflow | Phase 1 Ingest & Frame → Phase 2 Confirm (item by item) → Phase 3 Emit |
+| Critical Rules | Evidence-only; grain first; one blocking ask; layer split; stable names |
+| Reference Index | When to load playbook / schema-spec / okf-emitter / examples |
+
+## Runtime bundle (install payload)
+
+```text
+skills/semantic-layer-builder/
+├── SKILL.md
+└── references/
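+    ├── elicitation-playbook.md   # extraction rules + confirmation order + checklists
+    ├── schema-spec.md            # semantic-object schema constraints + Conformance
+    ├── okf-emitter.md            # Google OKF v0.1 mapping + hard constraints
+    └── examples.md               # interface → confirmation → YAML + OKF end-to-end example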
+```
+
+No `evals/`, `qa/`, or `*-workspace/` under `skills/`.
+
+## QA (not installed with skill)
+
+```text
+qa/semantic-layer-builder/
+├── validate.sh
+├── VERSION
+├── .markdownlint.json
+├── evals/evals.json             # 5 offline eval cases
+└── assertions/README.md
+```
+
+```bash
+./qa/semantic-layer-builder/validate.sh
+```
+
+## Install
+
+```bash
+npx skills add ontology-of-everything/SemanticSkills \
+  --skill semantic-layer-builder \
+  --agent cursor \
+  --copy -y
+```
+
+No external tools or network required — pure modeling and file generation.
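
The Phase 1 operation inventory in the flow above can be pictured with a minimal Python sketch. The six field names come from the flow diagram (name/method/safety/inputs/outputs/doc); the `Operation` class, the `ingest_rest` helper, and the rule that GET/HEAD/OPTIONS count as `read` are illustrative assumptions, not part of the skill itself.

```python
from dataclasses import dataclass, field

# HTTP methods treated as read-only here; an assumption for illustration.
READ_METHODS = {"GET", "HEAD", "OPTIONS"}


@dataclass
class Operation:
    """One row of the operation inventory (fields named after the flow diagram)."""
    name: str
    method: str
    safety: str                      # "read" or "write"
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    doc: str = ""


def ingest_rest(method: str, path: str, inputs: list[str],
                outputs: list[str], doc: str = "") -> Operation:
    """Normalize one REST endpoint into an inventory row (hypothetical helper)."""
    verb = method.upper()
    safety = "read" if verb in READ_METHODS else "write"
    return Operation(name=f"{verb} {path}", method=verb, safety=safety,
                     inputs=list(inputs), outputs=list(outputs), doc=doc)


# The endpoint from eval case 1: only fields evidenced by the interface are recorded.
orders = ingest_rest("get", "/v1/orders", [],
                     ["order_id", "store_id", "placed_at", "total_amount"])
print(orders.name, orders.safety, orders.outputs)
```

Keeping the inventory as plain data makes the later Frame step a pure classification over rows, with nothing read from anywhere except the supplied interface.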
diff --git a/qa/semantic-layer-builder/.markdownlint.json b/qa/semantic-layer-builder/.markdownlint.json
@@ -0,0 +1,12 @@
+{
+  "default": true,
+  "MD003": { "style": "atx" },
+  "MD004": { "style": "dash" },
+  "MD013": { "line_length": 200, "code_blocks": false, "tables": false },
+  "MD024": { "siblings_only": true },
+  "MD033": false,
+  "MD034": false,
+  "MD040": false,
+  "MD041": false,
+  "MD046": false
+}
diff --git a/qa/semantic-layer-builder/CHANGELOG.md b/qa/semantic-layer-builder/CHANGELOG.md
@@ -0,0 +1,24 @@
+# semantic-layer-builder Changelog
+
+Skill-only history. Repository tooling changes: [../../CHANGELOG.md](../../CHANGELOG.md).
+
+## 0.1.0 - 2026-06-29
+
+First release — meta-skill that turns an interface into a governed semantic layer.
+
+### Features
+
+- Guided, one-fact-at-a-time interview: Ingest → Frame → Confirm (facts/grain/dimensions/measures/routing) → Emit
+- Schema constraints for semantic objects (Kimball star/constellation): `catalog` (thin router) → shared-dimensions → model, in `references/schema-spec.md` with a Conformance checklist
+- Google OKF v0.1 export (recommended): concept-per-file mapping, reserved files, frontmatter, and OKF §9 hard-constraint check in `references/okf-emitter.md`
+- Evidence-only discipline: refuses to invent fields, grain, enums, or values; missing sources marked `TODO(verify)`
+- Ingest rules for REST/OpenAPI, CLI, and table/DDL inputs; end-to-end worked example (`references/examples.md`)
+
+### qa
+
+- `validate.sh` (layout, version sync, skills-ref, markdownlint, skillcheck)
+- Five offline evals in `evals/evals.json` (grain-first, no-invented-fields, OKF default target, one-blocking-ask, layer-split)
+
+### Documentation
+
+- `docs/skills/semantic-layer-builder.md`; `docs/catalog.yml` index entry
diff --git a/qa/semantic-layer-builder/README.md b/qa/semantic-layer-builder/README.md
@@ -0,0 +1,25 @@
+# semantic-layer-builder QA
+
+Per-skill quality gate. Run `validate.sh` locally and in CI via
+`tools/validate-all.sh`.
+
+## Layout
+
+```text
+qa/semantic-layer-builder/
+├── validate.sh              # entry point (required)
+├── README.md
+├── evals/evals.json         # Skill Creator eval cases
+├── assertions/README.md     # assertion rubric for eval authors
+├── fixtures/                # optional: contract YAML, golden files
+└── bin/                     # optional: helper scripts
+```
+
+Add `fixtures/` and `bin/` when a skill needs cross-layer checks beyond
+`skills-ref`, markdownlint, and skillcheck.
+
+## Commands
+
+```bash
+./qa/semantic-layer-builder/validate.sh
+```
diff --git a/qa/semantic-layer-builder/VERSION b/qa/semantic-layer-builder/VERSION
@@ -0,0 +1 @@
+0.1.0
diff --git a/qa/semantic-layer-builder/assertions/README.md b/qa/semantic-layer-builder/assertions/README.md
@@ -0,0 +1,8 @@
+# Assertions
+
+Define objective checks for Skill Creator runs. Copy assertions into each eval in `../evals/evals.json`.
+
+Include interaction-discipline cases when the skill has routing or scope decisions:
+
+- User already stated scope, billing cycle, or read-only intent → agent proceeds without re-asking.
+- Blocking ID missing (partner customer, full order id) → one clarifying question, not a questionnaire.
diff --git a/qa/semantic-layer-builder/evals/evals.json b/qa/semantic-layer-builder/evals/evals.json
@@ -0,0 +1,70 @@
+{
+  "skill_name": "semantic-layer-builder",
+  "evals": [
+    {
+      "id": 1,
+      "name": "grain-first-before-dimensions",
+      "prompt": "这是一个 REST 接口：GET /v1/orders 返回 { orders: [ { order_id, store_id, placed_at, total_amount } ] }。帮我建语义层。",
+      "expected_output": "先确认事实与 grain（一行=一笔订单）再挂维度；回显小表请确认；不直接跳到生成。",
+      "expectations": [
+        "Identifies a fact for the orders array before listing dimensions",
+        "States a falsifiable grain such as one row per order (orders[] element)",
+        "Echoes a confirmation table (fact/grain or dimensions) and asks to confirm before emitting",
+        "Does not produce final YAML/OKF before grain is confirmed"
+      ],
+      "files": []
+    },
+    {
+      "id": 2,
+      "name": "no-invented-fields",
+      "prompt": "把这个 CLI 转成语义层：`mytool list-invoices --customer` 输出每行 invoice_id, amount, currency。币种字典接口我没给你。",
+      "expected_output": "只用给出的字段；缺失的字典来源标 TODO(verify)，不编造 source_operations 或枚举值。",
+      "expectations": [
+        "Only models fields evidenced by the interface (invoice_id, amount, currency)",
+        "Marks the missing currency dictionary source as TODO(verify) instead of inventing an operation",
+        "Does not fabricate enum values, codes, or field paths not present in the input",
+        "Lists open TODO(verify) items in the deliverable"
+      ],
+      "files": []
+    },
+    {
+      "id": 3,
+      "name": "okf-default-target",
+      "prompt": "我有一个 orders 表（order_id PK, store_id FK, placed_at, total_amount）。直接生成语义对象，用推荐格式。",
+      "expected_output": "默认/推荐 Google OKF v0.1：每概念一文件、frontmatter 含非空 type、根 index.md 仅 okf_version；事实/维度/度量分别为 concept。",
+      "expectations": [
+        "Chooses Google OKF v0.1 as the recommended/default output target",
+        "Each concept is one markdown file with YAML frontmatter containing a non-empty type",
+        "Fact, dimension(s), and measure(s) are emitted as separate concepts",
+        "Root index.md carries only okf_version frontmatter (or none in sub-index)"
+      ],
+      "files": []
+    },
+    {
+      "id": 4,
+      "name": "one-blocking-ask-on-grain-ambiguity",
+      "prompt": "把 GET /v1/usage 转语义层，响应是 { data: [...] }，结构我暂时说不清。",
+      "expected_output": "在改变建模结果的歧义（grain/响应结构未知）处一次一问，给候选；不做问卷连环问，不臆造结构。",
+      "expectations": [
+        "Asks one blocking question about the unknown response structure / grain",
+        "Offers candidate grains or asks for the array element shape",
+        "Does not invent a schema or proceed to emit without the structure",
+        "Does not fire a long multi-question questionnaire"
+      ],
+      "files": []
+    },
+    {
+      "id": 5,
+      "name": "layer-split-values-to-contract",
+      "prompt": "ListProducts 返回 category 字段，枚举有 ELECTRONICS/FOOD/TOYS 等几十个。帮我建语义层维度。",
+      "expected_output": "语义层维度只描述形态（kind/business_key/source_operations），枚举取值不内联进语义对象，指向命令/契约层。",
+      "expectations": [
+        "Models category as a dimension with kind and business_key",
+        "Does NOT inline the full enum value list into the semantic object",
+        "Points enum/code values to the command/contract layer (notes references contract)",
+        "Keeps the semantic layer thin (shape, not concrete typed-in values)"
+      ],
+      "files": []
+    }
+  ]
+}
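
A quick shape check for `evals/evals.json` might look like the sketch below. The required keys mirror the five cases in this file; the `check_evals` function and the deliberately malformed second sample case are hypothetical and are not part of `validate.sh`.

```python
import json

# Keys every eval case in this file carries.
REQUIRED_KEYS = {"id", "name", "prompt", "expected_output", "expectations", "files"}


def check_evals(doc: dict) -> list[str]:
    """Return a list of problems found; an empty list means the shape looks consistent."""
    problems = []
    seen_ids = set()
    for case in doc.get("evals", []):
        label = case.get("name", "<unnamed>")
        missing = REQUIRED_KEYS - case.keys()
        if missing:
            problems.append(f"{label}: missing keys {sorted(missing)}")
        if case.get("id") in seen_ids:
            problems.append(f"{label}: duplicate id {case.get('id')}")
        seen_ids.add(case.get("id"))
        if not case.get("expectations"):
            problems.append(f"{label}: no expectations")
    return problems


# Sample data: the second case reuses id 1 and has no expectations on purpose.
sample = json.loads("""
{
  "skill_name": "semantic-layer-builder",
  "evals": [
    {"id": 1, "name": "grain-first-before-dimensions", "prompt": "...",
     "expected_output": "...", "expectations": ["States a falsifiable grain"], "files": []},
    {"id": 1, "name": "no-invented-fields", "prompt": "...",
     "expected_output": "...", "expectations": [], "files": []}
  ]
}
""")
for problem in check_evals(sample):
    print(problem)
```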
diff --git a/qa/semantic-layer-builder/validate.sh b/qa/semantic-layer-builder/validate.sh
@@ -0,0 +1,69 @@
+#!/usr/bin/env bash
+# Generic skill QA template: layout + version sync + skills-ref + markdownlint + skillcheck.
+set -euo pipefail
+
+ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
+QA_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+SKILL_DIR="$(cd "$QA_DIR/../../skills/semantic-layer-builder" && pwd)"
+
+fail() { printf 'FAIL: %s\n' "$1" >&2; exit 1; }
+need_cmd() { command -v "$1" >/dev/null 2>&1 || fail "missing command: $1"; }
+
+run_local_or_npx() {
+  local bin=$1; shift
+  if command -v "$bin" >/dev/null 2>&1; then "$bin" "$@"
+  else need_cmd npx; npx "$bin" "$@"; fi
+}
+
+# Skill install-payload purity: no evals, qa, .workspaces, etc.
+check_skill_layout() {
+  [[ -f "$SKILL_DIR/SKILL.md" ]] || fail "missing SKILL.md"
+  [[ -f "$QA_DIR/README.md" ]] || fail "missing QA README"
+  local item
+  local forbidden=(.DS_Store .agents analysis evals qa scripts tests .workspaces)
+  for item in "${forbidden[@]}"; do
+    [[ ! -e "$SKILL_DIR/$item" ]] || fail "forbidden in skill dir: $item"
+  done
+  local sibling
+  for sibling in "$SKILL_DIR"/*-workspace; do
+    [[ -e "$sibling" ]] || continue
+    fail "Skill Creator workspace belongs at repo root, not skills/: $(basename "$sibling")"
+  done
+  [[ -f "$QA_DIR/evals/evals.json" ]] || fail "missing evals file: $QA_DIR/evals/evals.json"
+  [[ -f "$QA_DIR/assertions/README.md" ]] || fail "missing assertions guide"
+  [[ ! -f "$QA_DIR/evals.json" ]] || fail "duplicate eval source: $QA_DIR/evals.json"
+}
+
+check_version_sync() {
+  need_cmd python3
+  QA_DIR="$QA_DIR" ROOT="$ROOT" SKILL_DIR="$SKILL_DIR" python3 - <<'PY'
+import os, sys
+from pathlib import Path
+try:
+    import yaml
+except ImportError:
+    sys.exit("FAIL: PyYAML required for version sync check")
+qa = Path(os.environ["QA_DIR"])
+root = Path(os.environ["ROOT"])
+skill = Path(os.environ["SKILL_DIR"])
+expected = (qa / "VERSION").read_text(encoding="utf-8").strip()
+catalog = yaml.safe_load((root / "docs/catalog.yml").read_text(encoding="utf-8"))
+entry = next(x for x in catalog.get("skills", []) if x.get("id") == "semantic-layer-builder")
+if entry.get("version") != expected:
+    sys.exit(f"FAIL: docs/catalog.yml version {entry.get('version')!r} != qa/VERSION ({expected})")
+body = skill.joinpath("SKILL.md").read_text(encoding="utf-8").split("---", 2)[1]
+meta = yaml.safe_load(body).get("metadata") or {}
+if meta.get("version") != expected:
+    sys.exit(f"FAIL: SKILL.md metadata.version {meta.get('version')!r} != qa/VERSION ({expected})")
+PY
+}
+
+need_cmd rg
+check_skill_layout
+check_version_sync
+run_local_or_npx skills-ref validate "$SKILL_DIR"
+run_local_or_npx markdownlint-cli2 --config "$QA_DIR/.markdownlint.json" "$SKILL_DIR/**/*.md"
+need_cmd skillcheck
+skillcheck "$SKILL_DIR" --target-agent cursor --strict-cursor --min-desc-score 70
+
+printf 'OK: semantic-layer-builder validation passed\n'
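
The frontmatter split used by `check_version_sync` can be tried in isolation. The sketch below assumes a trimmed-down `SKILL.md` string and swaps PyYAML for a regular expression so it runs on the standard library alone; `frontmatter` and `metadata_version` are hypothetical helper names.

```python
import re


def frontmatter(text: str) -> str:
    """Return the text between the first two '---' markers, as validate.sh does."""
    parts = text.split("---", 2)
    if len(parts) < 3:
        raise ValueError("no frontmatter block found")
    return parts[1]


def metadata_version(text: str):
    """Find an indented `version:` line in the frontmatter (regex stands in for PyYAML)."""
    match = re.search(r'^[ \t]+version:[ \t]*"?([^"\n]+?)"?[ \t]*$',
                      frontmatter(text), re.MULTILINE)
    return match.group(1) if match else None


# Trimmed-down stand-in for skills/semantic-layer-builder/SKILL.md.
skill_md = '''---
name: semantic-layer-builder
metadata:
  author: ontology-of-everything
  version: "0.1.0"
---

# Semantic Layer Builder
'''

expected = "0.1.0"  # stands in for the contents of qa/semantic-layer-builder/VERSION
found = metadata_version(skill_md)
print("OK" if found == expected else f"FAIL: {found!r} != {expected!r}")
```

Splitting on the literal `---` is simple but would cut the block short if a frontmatter value ever contained three consecutive hyphens, which is worth keeping in mind when editing the `description` field.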
diff --git a/skills/semantic-layer-builder/SKILL.md b/skills/semantic-layer-builder/SKILL.md
@@ -0,0 +1,57 @@
+---
+name: semantic-layer-builder
+description: Turns an interface (REST/OpenAPI, CLI, or table/DDL) into a governed Kimball semantic layer via a guided, one-fact-at-a-time interview—confirm facts, grain, dimensions, measures—then emits semantic objects as YAML/markdown or a Google OKF (Open Knowledge Format) bundle. Use when modeling an API/CLI/table into a semantic layer, ontology, or dimensional (star/snowflake) model, or exporting OKF. 触发词：语义层 / 元技能 / 接口(REST/CLI/数据表)转语义 / 事实·粒度·维度·度量 / 维度建模 / 本体 / OKF。Refuses to invent fields, grain, or values the interface does not evidence.
+license: Apache-2.0
+compatibility: No external tools or network; pure modeling + file generation. Optional YAML/markdown linter.
+metadata:
+  author: ontology-of-everything
+  version: "0.1.0"
+---
+
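+# Semantic Layer Builder (Interface → Semantic Layer Meta-Skill)
+
+Turns an **interface contract** (REST/OpenAPI, CLI, table/DDL) into a **governed semantic layer**: confirm facts and grain → attach dimensions and measures → generate per the schema, with optional export to **Google OKF** (recommended).
+
+> The interface is the only source of truth. **Do not invent** fields, grain, enums, or values; when evidence is missing, stop and ask one question at a time.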
+
+## Workflow
+
+### Phase 1 · Ingest & Frame
+
+1. **Ingest** — Normalize the interface into an operation inventory: for each op, record `name / method / safety (read/write) / inputs / outputs / doc` (rules in `references/elicitation-playbook.md` §1).
+2. **Frame** — Give each op a one-sentence business process and classify it as fact / dimension-lookup / scope. Route by fact·dimension·measure·time·scope.
+
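+### Phase 2 · Confirm (one question at a time; echo a small table for confirmation at each step)
+
+For the order, see `elicitation-playbook.md` §3:
+
+1. **Facts & Grain** — "one row = what?"; mark role and parent_fact.
+2. **Dimensions** — `kind` / `business_key` / `source_operations` / `attributes`; reuse shared dimensions and keep names stable.
+3. **Measures** — Additivity (additive / semi / non) + calculation definition.
+4. **Selection & pairing** — `selection_rule` for abstract dimensions; conditionally required fields (linear products pair `resource_size + measure_id`).
+5. **Routing & boundary** — `entry_points` + `evidence_boundary` (what cannot be answered).
+
+> Missing a never-assume item (grain / business_key / read-write / conditionally-required trigger) → must ask. Missing a safe-default item (optional attributes, naming) → disclose the default and continue.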
+
+### Phase 3 · Emit
+
+1. **Target** — `okf` (Google OKF v0.1, **default/recommended**) / `repo-yaml` / `markdown`.
+2. **Generate** — Generate per `schema-spec.md`; map to OKF per `okf-emitter.md`.
+3. **Verify** — Run the `schema-spec.md` §6 Conformance checks; for OKF, also run the three hard constraints in `okf-emitter.md` §6. Report pass/fail for each item.
+
+## Critical Rules
+
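+1. **Evidence-only** — Fields, grain, enums, and values come only from the interface; where evidence is missing, write `TODO(verify)` and list it as pending confirmation rather than inventing it.
+2. **Grain first** — Do not attach dimensions or write measures until the grain is locked.
+3. **One blocking ask** — Stop only where the answer changes the modeling result (one question at a time, with candidates); no chained questions.
+4. **Layer split** — Shape, routing, and grain belong to the semantic layer; values that can be typed in or read (enums, codes, field paths, commands) belong to the contract layer, and the semantic layer only points to them.
+5. **Stable names** — Do not rename facts or dimensions once confirmed (an OKF concept ID is its file path, so renaming breaks links).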
+
+## Reference Index (load on demand)
+
+| When to read | File |
+| --- | --- |
+| Phase 1–2 extraction + confirmation order + checklists | `references/elicitation-playbook.md` |
+| Phase 3 schema constraints + Conformance | `references/schema-spec.md` |
+| Exporting OKF: mapping / reserved files / hard constraints | `references/okf-emitter.md` |
+| End-to-end example | `references/examples.md` |
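
The never-assume / safe-default rule from Phase 2, combined with the "One blocking ask" rule, can be sketched as a small decision function. The two categories and their members come from SKILL.md; the string keys, the `next_step` name, and the returned dictionary shape are assumptions made for illustration.

```python
# Categories taken from the Phase 2 rule; the string keys are illustrative.
NEVER_ASSUME = ["grain", "business_key", "read_write", "conditional_required_trigger"]
SAFE_DEFAULT = ["optional_attributes", "naming"]


def next_step(missing: set[str]) -> dict:
    """Decide what to do with a set of missing items.

    Returns at most one blocking question (the first never-assume gap, in
    NEVER_ASSUME order, so grain is always asked first) plus the safe-default
    gaps that should be disclosed rather than asked about.
    """
    blocking = [item for item in NEVER_ASSUME if item in missing]
    disclosed = [item for item in SAFE_DEFAULT if item in missing]
    if blocking:
        return {"action": "ask", "question_about": blocking[0], "disclose": disclosed}
    return {"action": "continue", "question_about": None, "disclose": disclosed}


print(next_step({"naming", "business_key", "grain"}))
print(next_step({"naming"}))
```

Returning a single `question_about` value, instead of the whole list of gaps, is what keeps the interview from turning into a questionnaire.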
diff --git a/skills/semantic-layer-builder/references/elicitation-playbook.md b/skills/semantic-layer-builder/references/elicitation-playbook.md
diff --git a/skills/semantic-layer-builder/references/examples.md b/skills/semantic-layer-builder/references/examples.md
diff --git a/skills/semantic-layer-builder/references/okf-emitter.md b/skills/semantic-layer-builder/references/okf-emitter.md
diff --git a/skills/semantic-layer-builder/references/schema-spec.md b/skills/semantic-layer-builder/references/schema-spec.md