# pr_compliance_checklist.yaml
pr_compliances:
  - title: "Provider capability truthfulness"
    compliance_label: true
    objective: "Providers must advertise only capabilities implemented end to end."
    success_criteria: "Capability flags, public methods, docs, examples, tests, and provider metadata all describe the same implemented behavior."
    failure_criteria: "A provider claims predict, score, policy, generate, reason, embed, transfer, or plan support without a complete typed implementation and tests."

  - title: "Remote provider network safety"
    compliance_label: true
    objective: "Provider-controlled URLs and remote responses must not create SSRF, internal network access, unsafe redirects, or resource exhaustion."
    success_criteria: "Remote fetches validate scheme and host, reject private or link-local destinations by default, enforce timeout/retry policy, and bound artifact downloads before buffering."
    failure_criteria: "The PR fetches arbitrary provider-controlled URLs, follows unchecked redirects, buffers unbounded content, or lacks tests for unsafe URL and oversized artifact cases."

  - title: "Optional runtime and checkpoint safety"
    compliance_label: true
    objective: "Host-owned optional runtimes and checkpoint tooling must not execute untrusted remote model content by default."
    success_criteria: "Checkpoint/model loading pins revisions where appropriate, avoids arbitrary config instantiation or unsafe deserialization, and documents every code-execution surface before weights are loaded."
    failure_criteria: "The PR executes remote configs, pickles, dynamic imports, or constructors from caller-selected repositories without a narrow allowlist and explicit operator opt-in."

  - title: "Secrets and signed artifact URL redaction"
    compliance_label: true
    objective: "Credentials and temporary artifact URLs must not leak through events, logs, exceptions, transcripts, persisted worlds, or returned metadata."
    success_criteria: "All log-facing and caller-facing records redact bearer tokens, API keys, signed URL query strings, credentials, and secret-like metadata while preserving non-secret debugging context."
    failure_criteria: "Sanitization happens only in one surface, or raw signed URLs/secrets remain in result metadata, provider events, error messages, saved state, or UI transcripts."

  - title: "JSON-native public state"
    compliance_label: true
    objective: "Public models, provider metadata, planning outputs, benchmark metrics, and persisted worlds must stay JSON-native and coherent."
    success_criteria: "Boundary validation accepts only string keys, finite numbers, lists, dictionaries, booleans, strings, and null, with tests for rejected non-native values."
    failure_criteria: "The PR allows object instances, tuples, bytes, NaN, infinity, or internally inconsistent metrics/actions into public data or persistence."

  - title: "Optional dependency isolation"
    compliance_label: true
    objective: "Base installs must remain lightweight and independent of torch, CUDA, robotics stacks, Textual, Rerun, credentials, and checkpoints."
    success_criteria: "Optional integrations are behind extras, smoke commands, or host-owned setup, and import paths fail gracefully when optional dependencies are absent."
    failure_criteria: "The PR adds optional runtime dependencies to the base package or imports optional packages from core modules, CLI root paths, or non-optional provider surfaces."

  - title: "Robotics decision evidence"
    compliance_label: true
    objective: "Robotics demos and adapters must demonstrate why WorldForge belongs in the decision loop."
    success_criteria: "The PR shows WorldForge choosing, scoring, explaining, comparing, or exposing counterfactual Go2/SO-101 decisions with reusable traces and measurable outcome fields."
    failure_criteria: "The PR only logs or repackages decisions already made by DimOS, LeRobot, a simulator, or a hardcoded script without candidate scores, selected-action rationale, outcomes, or counterfactuals."

  - title: "Host-owned robotics runtime boundary"
    compliance_label: true
    objective: "WorldForge must not become the live robot runtime, account-binding flow, or hardware control stack."
    success_criteria: "Go2, SO-101, DimOS, LeRobot, and simulator integrations consume bounded replay/sim artifacts or optional host-owned runtime hooks, avoid live motion commands by default, keep device secrets out of traces, and document hardware requirements separately."
    failure_criteria: "The PR sends live robot commands from checkout-safe examples, vendors DimOS/LeRobot runtime code, requires robot dependencies in the base install, leaks host/device identifiers, or assumes account binding/hardware availability in CI."

  - title: "Regression tests and operator documentation"
    compliance_label: true
    objective: "Behavior changes must be testable and operationally clear."
    success_criteria: "Bug fixes and documented failure modes include focused tests, and public runtime/provider changes update README/docs/playbooks with commands, success signals, and first triage steps."
    failure_criteria: "The PR changes public behavior, provider contracts, runtime setup, persistence, CI, or docs navigation without tests or operator-facing documentation updates."