Commit 11df800

feat: add trust provenance scoring (#52)
* feat: add trust provenance scoring

Signed-off-by: Michael Kantor <6068672+kantorcodes@users.noreply.github.com>

* fix: restore ci command resolution

Signed-off-by: Michael Kantor <6068672+kantorcodes@users.noreply.github.com>

* fix: address trust provenance review feedback

Signed-off-by: Michael Kantor <6068672+kantorcodes@users.noreply.github.com>

* fix: tighten trust provenance validation

Signed-off-by: Michael Kantor <6068672+kantorcodes@users.noreply.github.com>

---------

Signed-off-by: Michael Kantor <6068672+kantorcodes@users.noreply.github.com>
1 parent 3d2854d commit 11df800

22 files changed

Lines changed: 1620 additions & 7 deletions

.github/workflows/ci.yml

Lines changed: 2 additions & 2 deletions
@@ -23,8 +23,8 @@ jobs:
         with:
           enable-cache: true
       - run: uv sync --frozen --extra dev --python ${{ matrix.python-version }}
-      - run: uv run --no-sync ruff check src/
-      - run: uv run --no-sync ruff format --check src/
+      - run: uv run --no-sync python -m ruff check src/
+      - run: uv run --no-sync python -m ruff format --check src/
       - run: uv run --no-sync pytest --tb=short

   cross-platform:

README.md

Lines changed: 16 additions & 0 deletions
@@ -45,6 +45,22 @@ If your repository uses a Codex marketplace root like `.agents/plugins/marketpla

 The score remains available as a trust and triage signal, but the primary workflow is **preflight + CI gating + publish readiness**.

+## Trust Score Provenance
+
+The scanner now emits explicit trust provenance alongside the quality grade:
+
+- bundled skills inherit the live HOL broker adapter model from HCS-28 and HCS-26 alignment work
+- MCP configuration trust is documented in a local draft spec
+- top-level Codex plugin trust is documented in a local draft spec
+
+Current local specs:
+
+- [Skill Trust Local Draft](docs/trust/skill-trust-local.md)
+- [MCP Trust Draft](docs/trust/mcp-trust-draft.md)
+- [Codex Plugin Trust Draft](docs/trust/plugin-trust-draft.md)
+
+This keeps the quality grade and the trust score separate. Signals like `SECURITY.md` are still visible, but their trust weight is now explicit instead of being inferred from raw category points.
+
 ## Quick Start For Contributors

```bash

docs/trust/mcp-trust-draft.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+# HOL-HCS-MCP-TRUST-DRAFT
+
+## Scope
+
+This local draft defines trust attribution for Codex plugin `.mcp.json` configuration.
+
+## Goals
+
+- explain how MCP trust is derived
+- separate MCP trust from the broader plugin quality grade
+- make transport and execution risk explicit instead of burying it in category points
+
+## Adapters
+
+### `verification` weight `1.0`
+
+Internal component weights:
+
+- `configIntegrity`: `40`
+- `executionSafety`: `35`
+- `transportSecurity`: `25`
+
+Signal mapping:
+
+- `configIntegrity`: `.mcp.json` parses and exposes expected top-level containers
+- `executionSafety`: local MCP commands avoid dangerous execution patterns
+- `transportSecurity`: remote MCP endpoints remain on HTTPS
+
+### `metadata` weight `0.75`
+
+Internal component weights:
+
+- `serverNaming`: `25`
+- `commandOrEndpoint`: `45`
+- `configShape`: `30`
+
+Signal mapping:
+
+- `serverNaming`: MCP surfaces are explicitly named
+- `commandOrEndpoint`: every MCP surface declares a concrete command or endpoint
+- `configShape`: local arguments and remote entries follow the expected shape
+
+## Normalization
+
+The scanner emits a weighted adapter `score` plus component-level evidence, then normalizes the adapter total as the average of declared components. The final MCP trust score is the weighted average of adapter totals.
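The normalization paragraph in this draft can be read as two nested weighted averages. The sketch below works through that reading with the weights listed in the draft. It is not the scanner's implementation: the function names `adapter_total` and `trust_total` and the sample component scores are hypothetical, and it assumes that component scores are on a 0-100 scale and that an adapter's total is the mean of its components weighted by the internal component weights.

```python
from __future__ import annotations

# Weights copied from the MCP draft; everything else here is illustrative.
MCP_ADAPTERS = {
    "verification": {
        "weight": 1.0,
        "components": {"configIntegrity": 40, "executionSafety": 35, "transportSecurity": 25},
    },
    "metadata": {
        "weight": 0.75,
        "components": {"serverNaming": 25, "commandOrEndpoint": 45, "configShape": 30},
    },
}


def adapter_total(component_weights: dict[str, float], scores: dict[str, float]) -> float:
    """Weighted mean of one adapter's component scores (assumed 0-100 each)."""
    weight_sum = sum(component_weights.values())
    return sum(component_weights[key] * scores[key] for key in component_weights) / weight_sum


def trust_total(adapters: dict, scores: dict[str, dict[str, float]]) -> float:
    """Weighted mean of adapter totals, using the adapter-level weights."""
    weight_sum = sum(adapter["weight"] for adapter in adapters.values())
    weighted = sum(
        adapter["weight"] * adapter_total(adapter["components"], scores[name])
        for name, adapter in adapters.items()
    )
    return round(weighted / weight_sum, 2)


# Hypothetical scan: one remote endpoint is not on HTTPS, every other signal passes.
sample_scores = {
    "verification": {"configIntegrity": 100, "executionSafety": 100, "transportSecurity": 0},
    "metadata": {"serverNaming": 100, "commandOrEndpoint": 100, "configShape": 100},
}
mcp_trust = trust_total(MCP_ADAPTERS, sample_scores)
```

Under these assumptions the `verification` adapter totals 75, `metadata` totals 100, and the final score is (1.0 × 75 + 0.75 × 100) / 1.75, roughly 85.71.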

docs/trust/plugin-trust-draft.md

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+# HOL-HCS-CODEX-PLUGIN-TRUST-DRAFT
+
+## Scope
+
+This local draft defines trust attribution for top-level Codex plugins.
+
+## Why this exists
+
+The scanner’s quality grade remains useful for gating, but it is not a trust specification. This draft explains how the scanner computes a separate plugin trust score with explicit weights and named evidence so contributors can see exactly how signals such as `SECURITY.md` affect the outcome.
+
+## Adapters
+
+### `verification` weight `1.0`
+
+Internal component weights:
+
+- `manifestIntegrity`: `35`
+- `interfaceIntegrity`: `25`
+- `pathSafety`: `20`
+- `marketplaceAlignment`: `20`
+
+### `security` weight `1.0`
+
+Internal component weights:
+
+- `disclosure`: `15`
+- `license`: `10`
+- `secretHygiene`: `35`
+- `mcpSafety`: `20`
+- `approvalHygiene`: `20`
+
+`SECURITY.md` is deliberately only one small part of the security adapter. It is not intended to dominate trust on its own.
+
+### `metadata` weight `0.75`
+
+Internal component weights:
+
+- `documentation`: `20`
+- `manifestMetadata`: `35`
+- `discoverability`: `20`
+- `provenance`: `25`
+
+### `operations` weight `0.75`
+
+Internal component weights:
+
+- `actionPinning`: `35`
+- `permissionScope`: `20`
+- `untrustedCheckout`: `25`
+- `updateAutomation`: `20`
+
+## Normalization
+
+The scanner emits a weighted adapter `score` plus component-level evidence, then normalizes the adapter total as the average of declared components. The final plugin trust score is the weighted average of adapter totals.
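The draft's claim that `SECURITY.md` is only a small part of plugin trust can be checked with a little arithmetic. The sketch below estimates the most a single component can contribute to the final score. It rests on two assumptions that the draft does not state outright: that `SECURITY.md` feeds the `disclosure` component, and that components contribute in proportion to their internal weights. The helper name `max_component_points` is hypothetical.

```python
# Adapter weights and security component weights are copied from the plugin draft.
ADAPTER_WEIGHTS = {"verification": 1.0, "security": 1.0, "metadata": 0.75, "operations": 0.75}
SECURITY_COMPONENTS = {
    "disclosure": 15,
    "license": 10,
    "secretHygiene": 35,
    "mcpSafety": 20,
    "approvalHygiene": 20,
}


def max_component_points(adapter: str, component: str, components: dict) -> float:
    """Upper bound, in points out of 100, that one component can add to the final score."""
    adapter_share = ADAPTER_WEIGHTS[adapter] / sum(ADAPTER_WEIGHTS.values())
    component_share = components[component] / sum(components.values())
    return round(100 * adapter_share * component_share, 2)


# Assumption: SECURITY.md maps to the `disclosure` component of the security adapter.
disclosure_cap = max_component_points("security", "disclosure", SECURITY_COMPONENTS)
```

With adapter weights summing to 3.5, the `security` adapter carries 1.0 / 3.5 of the total and `disclosure` carries 15% of that, so under this reading a perfect `SECURITY.md` moves the final score by at most about 4.29 points.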

docs/trust/skill-trust-local.md

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+# HOL-HCS-28-SKILL-TRUST-LOCAL-DRAFT
+
+## Scope
+
+This local draft defines how `codex-plugin-scanner` attributes trust to bundled Codex skills before those skills are published into the HOL skill registry.
+
+## Provenance
+
+This model inherits its adapter shape and weights from the live HOL broker implementation that scores published HCS-26 skills:
+
+- `verified` adapter weight: `1.0`
+- `safety` adapter weight: `1.0`
+- `metadata` adapter weight: `0.75`
+
+The scanner keeps those adapter weights and component names so local scores can be compared to registry scores later. It does not use registry cohort normalization locally because a single plugin checkout has no comparable cohort.
+
+## Adapter definitions
+
+### `verified`
+
+Internal component weights:
+
+- `publisherBound`: `20`
+- `repoCommitIntegrity`: `40`
+- `manifestIntegrity`: `30`
+- `domainProof`: `10`
+
+Local mapping:
+
+- `publisherBound`: plugin author metadata exists for the bundled skill package
+- `repoCommitIntegrity`: repository metadata plus semver version exists locally
+- `manifestIntegrity`: every bundled `SKILL.md` parses frontmatter with required fields
+- `domainProof`: homepage and repository hosts align
+
+### `safety`
+
+- single `score` component
+- backed by Cisco skill scanning when available
+- falls back to a neutral local score when the optional Cisco dependency is unavailable
+
+### `metadata`
+
+Internal component weights:
+
+- `links`: `30`
+- `description`: `25`
+- `taxonomy`: `20`
+- `provenance`: `25`
+
+Local mapping:
+
+- `links`: homepage and repository metadata for the bundled skill package
+- `description`: average bundled skill description quality
+- `taxonomy`: category and tag coverage
+- `provenance`: repository and version provenance present locally
+
+## Normalization
+
+Each adapter emits a weighted `score` plus component-level signals. The scanner normalizes the adapter the same way the broker does today: it averages the declared adapter components, then computes the final trust total as the weighted average of adapter totals.

pyproject.toml

Lines changed: 2 additions & 0 deletions
@@ -58,9 +58,11 @@ Issues = "https://github.com/hashgraph-online/codex-plugin-scanner/issues"
 [tool.ruff]
 target-version = "py310"
 line-length = 120
+extend-exclude = ["tests/test-trust-scoring.py", "tests/test-trust-specs.py"]

 [tool.ruff.lint]
 select = ["E", "F", "W", "I", "N", "UP", "B", "A", "SIM", "RUF"]

 [tool.pytest.ini_options]
 testpaths = ["tests"]
+python_files = ["test_*.py"]
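pytest treats `python_files` entries as glob-style patterns for test module names. The short check below shows how the hyphenated files excluded from Ruff in this hunk compare against the new pattern. It uses the standard library's `fnmatch` as a stand-in for pytest's own matching, and `test_cli.py` is a hypothetical file name added for contrast.

```python
from fnmatch import fnmatch

PATTERN = "test_*.py"  # value of python_files in the hunk above

# The two hyphenated names come from extend-exclude; test_cli.py is hypothetical.
candidates = ["test_cli.py", "test-trust-scoring.py", "test-trust-specs.py"]

# Keep only the names that the glob pattern would accept as test modules.
collected = [name for name in candidates if fnmatch(name, PATTERN)]
```

Only the underscore-named module matches, so the hyphenated files would not be picked up by this pattern alone.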

schemas/plugin-quality.v1.json

Lines changed: 17 additions & 1 deletion
@@ -46,7 +46,7 @@
     "scan": {
       "type": "object",
       "additionalProperties": false,
-      "required": ["score", "raw_score", "effective_score", "grade", "findings_total", "severity_counts"],
+      "required": ["score", "raw_score", "effective_score", "grade", "findings_total", "severity_counts", "trust"],
       "properties": {
         "score": {"type": "integer", "minimum": 0, "maximum": 100},
         "raw_score": {"type": "integer", "minimum": 0, "maximum": 100},
@@ -64,6 +64,22 @@
             "low": {"type": "integer", "minimum": 0},
             "info": {"type": "integer", "minimum": 0}
           }
+        },
+        "trust": {
+          "type": "object",
+          "additionalProperties": false,
+          "required": ["total", "domains"],
+          "properties": {
+            "total": {"type": "number", "minimum": 0, "maximum": 100},
+            "domains": {
+              "type": "array",
+              "items": {
+                "type": "object",
+                "required": ["domain", "label", "score", "spec", "adapters"],
+                "additionalProperties": true
+              }
+            }
+          }
         }
       }
     },
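To make the new `trust` requirement concrete, the sketch below checks a sample payload against the constraints visible in this hunk: `total` and `domains` are required, no other top-level keys are allowed, `total` must be a number from 0 to 100, and each domain needs five keys. This is a hand-rolled illustration, not a replacement for a JSON Schema validator. The function name `trust_shape_errors` and all sample values are hypothetical.

```python
from __future__ import annotations

TRUST_REQUIRED = ("total", "domains")
DOMAIN_REQUIRED = ("domain", "label", "score", "spec", "adapters")


def trust_shape_errors(trust: dict) -> list[str]:
    """Return human-readable problems; an empty list means the shape matches."""
    errors = [f"trust.{key} is required" for key in TRUST_REQUIRED if key not in trust]
    # additionalProperties is false on the trust object itself.
    errors += [f"trust.{key} is not allowed" for key in sorted(set(trust) - set(TRUST_REQUIRED))]
    if "total" in trust:
        total = trust["total"]
        is_number = isinstance(total, (int, float)) and not isinstance(total, bool)
        if not is_number or not 0 <= total <= 100:
            errors.append("trust.total must be a number between 0 and 100")
    # Domain items allow extra keys, so only the required ones are checked.
    for index, domain in enumerate(trust.get("domains", [])):
        for key in DOMAIN_REQUIRED:
            if key not in domain:
                errors.append(f"trust.domains[{index}].{key} is required")
    return errors


# Hypothetical payload; the values are made up for illustration.
sample_trust = {
    "total": 85.71,
    "domains": [
        {
            "domain": "mcp",
            "label": "MCP Trust",
            "score": 85.71,
            "spec": {"id": "HOL-HCS-MCP-TRUST-DRAFT", "path": "docs/trust/mcp-trust-draft.md"},
            "adapters": [],
        }
    ],
}
sample_errors = trust_shape_errors(sample_trust)
```

The nested `spec` and `adapters` shapes are left unchecked here because this schema only lists them as required keys; `schemas/scan-result.v1.json` defines them in more detail.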

schemas/scan-result.v1.json

Lines changed: 75 additions & 1 deletion
@@ -15,6 +15,7 @@
     "effective_score",
     "grade",
     "summary",
+    "trust",
     "categories",
     "findings",
     "timestamp",
@@ -81,6 +82,9 @@
         }
       }
     },
+    "trust": {
+      "$ref": "#/$defs/trustReport"
+    },
     "categories": {
       "type": "array",
       "items": {
@@ -224,12 +228,13 @@
     "pluginSummary": {
       "type": "object",
       "additionalProperties": false,
-      "required": ["name", "pluginDir", "score", "grade", "summary"],
+      "required": ["name", "pluginDir", "score", "grade", "trust", "summary"],
       "properties": {
         "name": {"type": "string", "minLength": 1},
         "pluginDir": {"type": "string", "minLength": 1},
         "score": {"type": "integer", "minimum": 0, "maximum": 100},
         "grade": {"type": "string", "enum": ["A", "B", "C", "D", "F"]},
+        "trust": {"$ref": "#/$defs/trustReport"},
         "summary": {
           "type": "object",
           "additionalProperties": false,
@@ -244,6 +249,75 @@
         }
       }
     },
+    "trustComponent": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["key", "score", "rationale", "evidence"],
+      "properties": {
+        "key": {"type": "string", "minLength": 1},
+        "score": {"type": "number", "minimum": 0, "maximum": 100},
+        "rationale": {"type": "string", "minLength": 1},
+        "evidence": {
+          "type": "array",
+          "items": {"type": "string"}
+        }
+      }
+    },
+    "trustAdapter": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["id", "label", "weight", "score", "components"],
+      "properties": {
+        "id": {"type": "string", "minLength": 1},
+        "label": {"type": "string", "minLength": 1},
+        "weight": {"type": "number", "minimum": 0},
+        "score": {"type": "number", "minimum": 0, "maximum": 100},
+        "components": {
+          "type": "array",
+          "items": {"$ref": "#/$defs/trustComponent"}
+        }
+      }
+    },
+    "trustDomain": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["domain", "label", "score", "spec", "adapters"],
+      "properties": {
+        "domain": {"type": "string", "minLength": 1},
+        "label": {"type": "string", "minLength": 1},
+        "score": {"type": "number", "minimum": 0, "maximum": 100},
+        "spec": {
+          "type": "object",
+          "additionalProperties": false,
+          "required": ["id", "version", "path", "derivedFrom"],
+          "properties": {
+            "id": {"type": "string", "minLength": 1},
+            "version": {"type": "string", "minLength": 1},
+            "path": {"type": "string", "minLength": 1},
+            "derivedFrom": {
+              "type": "array",
+              "items": {"type": "string", "minLength": 1}
+            }
+          }
+        },
+        "adapters": {
+          "type": "array",
+          "items": {"$ref": "#/$defs/trustAdapter"}
+        }
+      }
+    },
+    "trustReport": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["total", "domains"],
+      "properties": {
+        "total": {"type": "number", "minimum": 0, "maximum": 100},
+        "domains": {
+          "type": "array",
+          "items": {"$ref": "#/$defs/trustDomain"}
+        }
+      }
+    },
     "skippedTarget": {
       "type": "object",
       "additionalProperties": false,

src/codex_plugin_scanner/cli.py

Lines changed: 16 additions & 2 deletions
@@ -26,26 +26,35 @@

 def _build_plain_text(result) -> str:
     if getattr(result, "scope", "plugin") == "repository":
+        trust_total = result.trust_report.total if getattr(result, "trust_report", None) else 0.0
         lines = [
             f"🔗 Codex Plugin Scanner v{__version__}",
             f"Scanning repository: {result.plugin_dir}",
             f"Marketplace: {result.marketplace_file or 'not found'}",
             f"Local plugins scanned: {len(result.plugin_results)}",
             f"Skipped marketplace entries: {len(result.skipped_targets)}",
+            f"Trust: {trust_total}/100",
             "",
             "Per-plugin scores:",
         ]
         for plugin in result.plugin_results:
             plugin_name = plugin.plugin_name or Path(plugin.plugin_dir).name
-            lines.append(f" - {plugin_name}: {plugin.score}/100 ({plugin.grade})")
+            plugin_trust = plugin.trust_report.total if getattr(plugin, "trust_report", None) else 0.0
+            lines.append(f" - {plugin_name}: {plugin.score}/100 ({plugin.grade}), trust {plugin_trust}/100")
         if result.skipped_targets:
             lines += ["", "Skipped entries:"]
             for skipped in result.skipped_targets:
                 source_path = f" [{skipped.source_path}]" if skipped.source_path else ""
                 lines.append(f" - {skipped.name}{source_path}: {skipped.reason}")
         lines.append("")
     else:
-        lines = [f"🔗 Codex Plugin Scanner v{__version__}", f"Scanning: {result.plugin_dir}", ""]
+        trust_total = result.trust_report.total if getattr(result, "trust_report", None) else 0.0
+        lines = [
+            f"🔗 Codex Plugin Scanner v{__version__}",
+            f"Scanning: {result.plugin_dir}",
+            f"Trust: {trust_total}/100",
+            "",
+        ]
         for category in result.categories:
             cat_score = sum(c.points for c in category.checks)
             cat_max = sum(c.max_points for c in category.checks)
@@ -57,6 +66,11 @@ def _build_plain_text(result) -> str:
             lines.append("")
     counts = ", ".join(f"{severity.value}:{result.severity_counts.get(severity.value, 0)}" for severity in Severity)
     lines += [f"Findings: {counts}", ""]
+    if getattr(result, "trust_report", None) and result.trust_report.domains:
+        lines.append("Trust Provenance:")
+        for domain in result.trust_report.domains:
+            lines.append(f" - {domain.label}: {domain.score}/100 ({domain.spec_id})")
+        lines.append("")
     separator = "━" * 37
     label = GRADE_LABELS.get(result.grade, "Unknown")
     lines += [separator, f"Final Score: {result.score}/100 ({result.grade} - {label})", separator]
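The trust-related lines added to `_build_plain_text` can be exercised in isolation. The sketch below pulls that logic into a standalone helper so the fallback behaviour is easy to see. The dataclasses are hypothetical stand-ins for the scanner's real result types, which this diff does not show, and `trust_lines` is not a function in the codebase; only the attribute names `trust_report`, `total`, `domains`, `label`, `score`, and `spec_id` are taken from the hunk.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TrustDomain:  # hypothetical stand-in for the scanner's domain type
    label: str
    score: float
    spec_id: str


@dataclass
class TrustReport:  # hypothetical stand-in
    total: float
    domains: list[TrustDomain] = field(default_factory=list)


@dataclass
class ScanResult:  # hypothetical stand-in; the real result carries many more fields
    trust_report: TrustReport | None = None


def trust_lines(result) -> list[str]:
    """Mirror the trust-related output lines added in the hunk above."""
    report = getattr(result, "trust_report", None)
    total = report.total if report else 0.0  # same 0.0 fallback as the diff
    lines = [f"Trust: {total}/100"]
    if report and report.domains:
        lines.append("Trust Provenance:")
        for domain in report.domains:
            lines.append(f" - {domain.label}: {domain.score}/100 ({domain.spec_id})")
        lines.append("")
    return lines


without_report = trust_lines(ScanResult())
with_report = trust_lines(
    ScanResult(TrustReport(85.71, [TrustDomain("MCP Trust", 85.71, "HOL-HCS-MCP-TRUST-DRAFT")]))
)
```

One consequence of the `0.0` fallback is that a result with no trust report prints `Trust: 0.0/100`, which reads the same as a genuinely zero score. Whether that distinction matters depends on how the output is consumed.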
