> | A2: Integration points | Partial | Rate limits not addressed for any channel API. The burst deploy scenario (twelve teams deploying within ten minutes) could hit Slack's rate limit of ~1 message per second per channel if multiple teams share an alerts channel. | Bug |
- > | C8: Operational impact | Partial | The document specifies monitoring (health endpoint, delivery latency metric) but does not identify who monitors the notification service itself. If notifications silently fail — e.g., the service is down during a deploy — who gets paged? The monitoring observes the service but nobody is named as the operator. | Bug |
+ > | A7: Deployment and environment | Partial | The document specifies monitoring (health endpoint, delivery latency metric) but does not identify who monitors the notification service itself. If notifications silently fail — e.g., the service is down during a deploy — who gets paged? The monitoring observes the service but nobody is named as the operator. | Bug |
The auditor also raises two non-Bug findings:
@@ -210,7 +210,7 @@ The auditor re-assesses only the criteria that were Partial or Fail in Pass 1.
> |---|---|---|
> | S6: Dependency identification | Pass | Rate limits addressed. One-team-one-channel constraint stated. Slack, Teams, and PagerDuty dependencies documented with their limitations. |
| `templates/retrospective.md` | Retrospective template for rubric evolution. |
|`CONTRIBUTING.md`| How to propose rubric changes — the contribution model for evolving the framework. |
|`VERSION`|**Framework version.** Plain-text file containing the current AIDOS framework semver (e.g. `1.0.0`). Read on session start — used to compare against the audited file's `AIDOS Version` metadata. |
src/framework.md (3 additions, 3 deletions)
@@ -99,9 +99,9 @@ AIDOS depends on separation between artifact creation and artifact audit. The sa
Every artifact is assessed against two rubric layers:
- **Core Rubric** — universal criteria that apply to every artifact at every scale. Alignment to goals. Simplicity. Explicit trade-offs. Failure modes. Testability. Observability. Security. Operational impact. Reversibility. Future team readiness. Unit coherence. No duplication.
+ **Core Rubric** — universal criteria that apply to every artifact at every scale. Alignment to goals. Simplicity. Explicit trade-offs. Failure modes. Testability. Observability. Security. Reversibility. Future team readiness. Unit coherence. No duplication.
- **Discipline Rubric** — criteria specific to each artifact type. The Problem rubric (P1–P10) checks clarity, stakeholders, measurability, root cause confidence, scope, assumptions, constraints, impact, and alternatives. The Solution rubric (S1–S9) checks coherence, workflows, edge cases, alternatives, dependencies, migration, and minimum viable slice. The Tech Design rubric (A1–A10) checks components, integration, data model, error handling, technology choices, performance, deployment, and coding agent readiness. The Testing rubric (T1–T9) checks coverage, traceability, scenarios, exit criteria, expected results, test data, environments, regression, and prioritisation. The Definition rubric (F1–F8) checks outcome accuracy, key trade-offs, maintainer orientation, known limitations, operational context, domain placement, standalone comprehension, and currency.
+ **Discipline Rubric** — criteria specific to each artifact type. The Problem rubric (P1–P11) checks clarity, stakeholders, measurability, root cause confidence, scope, assumptions, constraints, impact, alternatives, and implementation neutrality. The Solution rubric (S1–S10) checks coherence, workflows, edge cases, alternatives, dependencies, migration, minimum viable slice, and implementation neutrality. The Tech Design rubric (A1–A10) checks components, integration, data model, error handling, technology choices, performance, deployment, and coding agent readiness. The Testing rubric (T1–T9) checks coverage, traceability, scenarios, exit criteria, expected results, test data, environments, regression, and prioritisation. The Definition rubric (F1–F7) checks outcome accuracy, key trade-offs, maintainer orientation, known limitations, operational context, domain placement, and currency.
Each criterion has a defined "what pass looks like." The auditor assesses Pass, Partial, or Fail with cited evidence. The evidence requirement is what gives rubrics teeth — you can't hand-wave a Pass. Partials are accepted or rejected by the human directing the audit, not waved through. The artifact doesn't advance until bugs are fixed.
@@ -355,7 +355,7 @@ The Definition lives in the Feature Repository (`definitions/`), organised by pr
**A project cannot close with delivery artifacts unarchived.** Just as a project cannot close with PARKED overflow items, it cannot close without distilling its outcome into Definitions and archiving the delivery stack. The distillation session is how a project ends.
- The Definition rubric (F1–F8) is assessed through the Maintenance lens. Full criteria are in `src/rubrics/definition.md`.
+ The Definition rubric (F1–F7) is assessed through the Maintenance lens. Full criteria are in `src/rubrics/definition.md`.
src/prompts/auditor-prompt.md (18 additions, 18 deletions)
@@ -82,25 +82,24 @@ Then proceed to the audit.
## Rubric Criteria
- ### Core Rubric (C1–C13) — Every Artifact, Every Scale
+ ### Core Rubric (C1–C12) — Every Artifact, Every Scale
| # | Criterion | What "Pass" Looks Like |
|---|---|---|
| C1 | Alignment to goals | Every element traces to a stated goal or requirement. Nothing is included that doesn't serve a declared purpose. |
- | C2 | Simplicity | The simplest approach that meets the requirements. Complexity is justified where it exists. A simpler alternative was considered and rejected for a stated reason. |
+ | C2 | Simplicity | The simplest approach that meets the requirements. Complexity is justified where it exists. |
| C3 | Explicit trade-offs | Trade-offs are named. Options considered, decision taken, and reasoning are documented. |
| C4 | Failure modes | What can go wrong and how failures are detected or handled. Silence on failure is itself a failure. |
| C5 | Testability | Every claim, requirement, or design choice can be verified by a specific action. |
| C6 | Observability | How you would know — in practice — whether the thing is working or not. |
| C7 | Security | Security implications considered proportionate to the risk. "Not applicable" is stated, not assumed. |
- | C8 | Operational impact | Who runs this, how it's deployed, what changes for operations. Ownership identified and accepted. |
- | C9 | Reversibility | What can be undone and what can't. Irreversible choices are acknowledged and justified. |
- | C10 | Future team readiness | Someone unfamiliar could pick this up and understand what was done, why, and what's left. |
- | C11 | Internal consistency | Terminology used consistently, sections don't contradict each other, reads as one coherent unit. |
- | C12 | No duplication | References rather than copies. Each fact lives in one place. |
- | C13 | Single unit of work | Addresses a single deliverable that can be independently understood, built, tested, and released. |
+ | C8 | Reversibility | What can be undone and what can't. Irreversible choices are acknowledged and justified. |
+ | C9 | Future team readiness | Someone unfamiliar could pick this up and understand what was done, why, and what's left. |
+ | C10 | Internal consistency | Terminology used consistently, sections don't contradict each other, reads as one coherent unit. |
+ | C11 | No duplication | References rather than copies. Each fact lives in one place. |
+ | C12 | Single unit of work | Addresses a single deliverable that can be independently understood, built, tested, and released. |
| P9 | Impact and urgency | Cost quantified where possible. Why now. What happens if not addressed. Evidence-based, not assertion-based. |
| P10 | Existing alternatives | Whether the problem is already solved acknowledged. If alternatives exist, insufficiency is stated. Building is justified. |
+ | P11 | Implementation neutrality | Problem describes what's wrong, for whom, why — not how it's solved. Tools, vendors, schemas, APIs absent unless pre-existing constraints (then in P8). Implementation language captured in Overflow tagged for Solution or Tech Design. |

- ### Solution Rubric (S1–S9) — Analysis Lens
+ ### Solution Rubric (S1–S10) — Analysis Lens
| # | Criterion | What "Pass" Looks Like |
|---|---|---|
@@ -128,6 +128,7 @@ Then proceed to the audit.
| S7 | Migration and transition | Path from current to proposed state described. Cutover, compatibility, rollback addressed. |
| S8 | Actor identification | Every person, team, or system that interacts is identified with specific interactions described. |
| S9 | Constraint compliance | Solution respects Problem constraints. Gaps acknowledged with explicit mitigation or trade-off. |
+ | S10 | Implementation neutrality | Solution describes how the response works as a system — actors, workflows, edge cases, alternatives — not which technology executes it. Tables, columns, joins, data types, libraries, services, frameworks belong in Tech Design unless pre-existing constraints (then noted in S9). Implementation detail captured in Overflow tagged for Tech Design. |
| T8 | Regression awareness | Existing functionality at risk identified with regression tests. Proportionate to blast radius. |
| T9 | Risk-based prioritisation | Must-pass vs should-pass distinguished. Priority clear when time is short. |
- ### Definition Rubric (F1–F8) — Maintenance Lens
+ ### Definition Rubric (F1–F7) — Maintenance Lens
The Definition is the post-delivery artifact — the living description of what was built, maintained as the feature evolves. Its audience is someone who was never in the room.
| # | Criterion | What "Pass" Looks Like |
|---|---|---|
| F1 | Outcome accuracy | Describes what was actually built, not what was planned. Divergences stated with reason. |
| F2 | Key trade-offs preserved | Significant decisions captured with context. Not every decision — the shaping ones. |
- | F3 | Maintainer orientation | Answers: what does this do, why this way, limitations, what to know to change it safely. No delivery-process language. |
+ | F3 | Maintainer orientation | Self-contained. Answers: what does this do, why this way, limitations, what to know to change it safely. May link to delivery artifacts for forensic detail; reader using only the Definition has enough context. No delivery-process language. |
| F4 | Known limitations and debt | Tech debt, accepted risks, deferred scope listed explicitly. BACKLOG items represented. |
| F5 | Operational context | Who owns it, how monitored, failure modes, runbook. Enough for on-call without the full Tech Design. |
| F6 | Domain placement | Filed by product domain, not project. Findable by domain browsing. |
- | F7 | Standalone comprehension | Self-contained. May link to archived delivery artifacts but doesn't require them. |
- | F8 | Currency | Reflects current system state. Updates visible via version history or "Last updated" summary. |
+ | F7 | Currency | Reflects current system state. Updates visible via version history or "Last updated" summary. |
**Story scale exception:** Stories do not produce Definitions. They inherit from their parent Feature or Epic Definition. Do not audit a Definition at Story scale.
### Story-Scale Subset
At story scale, audit is lighter but the criteria still apply. Focus on these as the primary assessment:
-**Risk** — decision required. The human decides: accept, mitigate, or defer. Risks don't block the artifact, but they need explicit disposition.
-**Idea** — noted, not actioned unless chosen. Ideas do not drive additional audit passes. Table them separately.
- **C13 failures are always Bugs.** If the artifact is trying to cover too many concerns, recommend decomposition into sibling artifacts at the same scale level.
+ **C12 failures are always Bugs.** If the artifact is trying to cover too many concerns, recommend decomposition into sibling artifacts at the same scale level.
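
This change renumbers several rubrics (C1–C12, P1–P11, S1–S10, F1–F7), so a declared range can easily drift from the table rows beneath it. The following is a minimal sketch of a consistency check. It assumes rubric headings keep the `(X1–XN)` form and that criteria rows begin with `| XN |`. `check_rubric_ranges` is a hypothetical helper for illustration and is not part of the AIDOS repository.

```python
import re

# Heading such as "### Core Rubric (C1–C12) ..." (en dash or hyphen accepted).
HEADING = re.compile(r"^#{2,6}\s+.+?\s+\((?P<prefix>[A-Z])1[–-](?P=prefix)(?P<last>\d+)\)")
# Table row such as "| C8 | Reversibility | ... |".
ROW = re.compile(r"^\|\s*(?P<prefix>[A-Z])(?P<num>\d+)\s*\|")


def check_rubric_ranges(markdown: str) -> list[str]:
    """Return one message per rubric whose rows do not match its declared range."""
    declared: dict[str, int] = {}    # prefix -> last criterion number in the heading
    seen: dict[str, list[int]] = {}  # prefix -> criterion numbers found in table rows

    for line in markdown.splitlines():
        heading = HEADING.match(line)
        if heading:
            declared[heading["prefix"]] = int(heading["last"])
            continue
        row = ROW.match(line)
        if row:
            seen.setdefault(row["prefix"], []).append(int(row["num"]))

    problems = []
    for prefix, last in declared.items():
        expected = list(range(1, last + 1))
        actual = seen.get(prefix, [])
        if actual != expected:
            problems.append(
                f"{prefix}1-{prefix}{last} declared, but rows found: {actual}"
            )
    return problems
```

Passing the text of a rubric file to this function returns an empty list when every declared range matches its rows in order. A heading that still said F1–F8 above a seven-row table would be reported.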