release 2.1.0: add Base Slice context for FPF. Strictly separate the Method (plan, design time) from the Work/Validation (actual execution, run time). Add formality tracking to hypotheses, plus novelty/complexity tags.

m0n0x41d · m0n0x41d · commit 041d931c3d46 · 2025-12-13T17:36:23.000+04:00
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -5,6 +5,53 @@ All notable changes to Crucible Code will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [2.1.0] - 2025-12-13
+
+### Added
+
+#### Repository Context
+
+- **`.fpf/context.md`**: Created by `/fpf-0-init` to define the "Base Slice" (Tech Stack, Scale, Constraints).
+- **Context Awareness**: All commands (`hypothesize`, `research`, `test`) now read this file to ground decisions.
+- **CLAUDE.md Update**: Instructions for Claude to check `.fpf/context.md` first.
+
+#### Enhanced Hypothesis Structure
+
+- **Formality (F-Score)**: Added `formality: [0-9]` to hypothesis frontmatter.
+- **NQD Tags**: Added `novelty` and `complexity` to hypothesis frontmatter for diversity tracking.
+- **Strict Method/Work Split**: Hypothesis body restructured into "1. The Method (Design-Time)" and "2. The Validation (Run-Time)" to enforce A.15.
+
+#### Documentation
+
+- **F-Score Definitions**: Added explanation of F0-F9 ranges to README.
+- **TODOs**: Added roadmap items for deeper NQD and Method/Work integration.
+
+## [2.1.0] - 2025-12-13
+
+### Added
+
+#### Agentic Initialization
+
+- **Smart `/fpf-0-init`**: Now scans the repository (package.json, Dockerfile, etc.) to infer tech stack.
+- **Interactive Interview**: Asks the user clarifying questions about Scale, Budget, and Constraints to build a robust Context.
+- **`.fpf/context.md`**: New foundational file that grounds all reasoning in the project's specific reality.
+
+#### Repository Context Integration
+
+- **Context Awareness**: All commands (`hypothesize`, `research`, `test`) now read `.fpf/context.md` so that decisions stay relevant to the project.
+- **CLAUDE.md Update**: Instructions for Claude to check `.fpf/context.md` first.
+
+#### Enhanced Hypothesis Structure
+
+- **Formality (F-Score)**: Added `formality: [0-9]` to hypothesis frontmatter.
+- **NQD Tags**: Added `novelty` and `complexity` to hypothesis frontmatter for diversity tracking.
+- **Strict Method/Work Split**: Hypothesis body restructured into "1. The Method (Design-Time)" and "2. The Validation (Run-Time)" to enforce A.15.
+
+#### Documentation
+
+- **F-Score Definitions**: Added explanation of F0-F9 ranges to README.
+- **Concepts**: Added simple explanations for NQD and Method vs. Work.
+
 ## [2.0.0] - 2025-12-13
 
 ### Added
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -67,7 +67,8 @@ When reasoning through problems, apply these principles:
 
 ### 2. Investigate the Codebase
 
-- **Check `.fpf/knowledge/` first** — Project knowledge base with verified claims at different assurance levels
+- **Check `.fpf/context.md` first** — Project context, constraints, and tech stack
+- **Check `.fpf/knowledge/`** — Project knowledge base with verified claims at different assurance levels
 - **Check `.context/` directory** — Architectural documentation and design decisions
 - Use Task tool for broader/multi-file exploration (preferred for context efficiency)
 - Explore relevant files and directories
diff --git a/README.md b/README.md
@@ -171,6 +171,16 @@ Problem Statement
 | **L2** | Verified | Empirically tested | `/fpf-3-test` or `/fpf-3-research` |
 | **Invalid** | Disproved | Was wrong — kept for learning | Failed at any stage |
 
+### Formality (F-Score)
+
+Formality measures rigor of expression: how precisely a claim is stated, not whether it is true.
+
+- **F0-F2 (Sketch):** Rough ideas, whiteboard notes, vague constraints.
+- **F3-F5 (Structured):** Steps, clear constraints, executable code/tests.
+- **F6-F9 (Rigorous):** Formal proofs, math models, machine-checked invariants.
+
+Most engineering work targets **F3-F5**.
+
 ### WLNK (Weakest Link)
 
 **Critical rule:** Assurance = min(evidence), NEVER average.
@@ -185,6 +195,23 @@ Combined Assurance: R_eff = 0.65 (capped by weakest)
 
 One weak piece of evidence caps your entire argument. If you want solid decisions, you need to strengthen — or acknowledge — your weakest link.
 
+### NQD (Novelty, Quality, Diversity)
+
+We track three metrics to ensure we aren't just guessing:
+
+- **Novelty:** How new is this idea? (Conservative vs. Radical). We want a mix.
+- **Quality:** How likely is it to work? (High complexity = Higher risk).
+- **Diversity:** Are we exploring different *types* of solutions? (e.g. Architectural vs. Operational).
+
+### Method vs. Work
+
+Crucible Code strictly separates the **Plan** from the **Result**.
+
+- **Method (Design-Time):** The recipe. The code you plan to write. The "How-To".
+- **Work (Run-Time):** The cooking. The test results. The logs. The "What Happened".
+
+Hypotheses now define the **Method** (Plan) first, then outline the **Validation** (Work) needed to prove it.
+
 ### Congruence (for External Evidence)
 
 External evidence may not apply to your context. Assess congruence:
@@ -379,10 +406,9 @@ The framework teaches itself through use. A few cycles and it clicks.
 
 FPF belongs to Anatoly Levenchuk. This project inherits any FPF-associated licenses. No additional proprietary licenses imposed.
 
+- Edge case handling
+- Documentation clarity
+
 ## Contributing
 
 Issues and PRs welcome. Focus areas:
-
-- Better examples
-- Edge case handling
-- Documentation clarity
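
The F-Score bands added to the README (F0-F2 Sketch, F3-F5 Structured, F6-F9 Rigorous) can be expressed as a small lookup. This is only an illustrative sketch: the function name `formality_band` is hypothetical and is not part of Crucible Code.

```python
def formality_band(score: int) -> str:
    """Map an F-Score (0-9) to the band name used in the README.

    Hypothetical helper for illustration; the band boundaries come from
    the README text, the function itself does not exist in the project.
    """
    if isinstance(score, bool) or not isinstance(score, int):
        raise ValueError("F-Score must be an integer")
    if not 0 <= score <= 9:
        raise ValueError("F-Score must be between 0 and 9")
    if score <= 2:
        return "Sketch"       # rough ideas, whiteboard notes
    if score <= 5:
        return "Structured"   # steps, clear constraints, executable tests
    return "Rigorous"         # formal proofs, machine-checked invariants


if __name__ == "__main__":
    for f in (1, 4, 8):
        print(f"F{f}: {formality_band(f)}")
```

Under this reading, a hypothesis tagged `formality: 4` falls in the Structured band that the README says most engineering work targets.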
diff --git a/commands/fpf-0-init.md b/commands/fpf-0-init.md
@@ -33,7 +33,42 @@ mkdir -p .fpf/decisions
 mkdir -p .fpf/sessions
 ```
 
-### 3. Create Session File
+### 3. Create Context File (Agentic)
+
+**Do not just create a blank file.**
+
+1.  **Investigate:** Scan the repository for technical signals.
+    - Check `package.json`, `go.mod`, `Cargo.toml`, `requirements.txt`, `pom.xml`, `Gemfile`.
+    - Check `Dockerfile`, `docker-compose.yml`, `k8s/`, `.github/workflows`.
+    - Check `README.md` for architecture notes.
+
+2.  **Draft & Interview:**
+    - Present what you found: "I detected Python 3.11 and Django..."
+    - Ask **specific** questions for what you can't see (Scale, Budget, Constraints).
+    - *Example:* "I see this is a web app. What is the target user scale? (<1k, >1M?)"
+
+3.  **Write `.fpf/context.md`:**
+    - Combine your findings and the user's answers.
+
+```markdown
+# Repository Context (A.2.6 Context Slice)
+
+## Tech Stack (Inferred)
+- **Language:** [e.g. Python 3.11]
+- **Frameworks:** [e.g. Django 4.2]
+- **Infra:** [e.g. Docker, AWS]
+
+## Scale & Performance (User-Defined)
+- **Users:** [Value]
+- **Traffic:** [Value]
+- **Latency Target:** [Value]
+
+## Hard Constraints (User-Defined)
+- [Constraint 1]
+- [Constraint 2]
+```
+
+### 4. Create Session File
 
 Create `.fpf/session.md`:
 
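
The "Investigate" step in the new `/fpf-0-init` instructions is carried out by the agent with its own tools, not by a script. Purely as an illustration of what "scanning for technical signals" amounts to, the sketch below checks for the manifest and infrastructure files named in the step. The function name `detect_signals` and the ecosystem labels are assumptions added here, not something the command defines.

```python
from pathlib import Path

# File names taken from the /fpf-0-init instructions; the ecosystem
# labels on the right are illustrative assumptions.
MANIFESTS = {
    "package.json": "Node.js",
    "go.mod": "Go",
    "Cargo.toml": "Rust",
    "requirements.txt": "Python",
    "pom.xml": "Java (Maven)",
    "Gemfile": "Ruby",
}
INFRA = ("Dockerfile", "docker-compose.yml", "k8s", ".github/workflows")


def detect_signals(repo_root):
    """Return the manifests and infra markers present at the repo root.

    Hypothetical helper: it only checks for existence and does not read
    versions, so the result is a starting point for the interview.
    """
    root = Path(repo_root)
    languages = sorted(
        label for name, label in MANIFESTS.items() if (root / name).is_file()
    )
    infra = [name for name in INFRA if (root / name).exists()]
    return {"languages": languages, "infra": infra}


if __name__ == "__main__":
    print(detect_signals("."))
```

Anything such a scan cannot see (scale, budget, hard constraints) still has to come from the interview with the user.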
diff --git a/commands/fpf-1-hypothesize.md b/commands/fpf-1-hypothesize.md
@@ -45,6 +45,7 @@ Problem: `$ARGUMENTS.problem`
 ### 1. Load Context
 
 - Read `.fpf/session.md` for any active context
+- Read `.fpf/context.md` for project context (Tech Stack, Scale, Constraints)
 - Read relevant project files to understand constraints
 - Check `.fpf/knowledge/L2/` for verified facts that constrain solution space
 - Check `.fpf/knowledge/invalid/` for approaches already disproven
@@ -94,6 +95,9 @@ type: hypothesis
 created: [timestamp]
 problem: [reference to problem]
 status: L0
+formality: [0-9] # F-Score (0=Sketch, 9=Proof)
+novelty: [Conservative|Novel|Radical]
+complexity: [Low|Medium|High]
 author: Claude (generated), Human (to review)
 scope:
   applies_to: "[conditions where this solution applies]"
@@ -103,13 +107,28 @@ scope:
 
 # Hypothesis: [Clear one-line statement]
 
-## Approach
+## 1. The Method (Design-Time)
+*This is the plan/recipe. (A.15 MethodDescription)*
+
+### Proposed Approach
 [2-3 sentences: what this solution proposes]
 
-## Rationale
+### Rationale
 [Why this might work — the abductive reasoning]
 
-## Plausibility Assessment
+### Implementation Steps
+1. [Step 1]
+2. [Step 2]
+3. [Step 3]
+
+### Expected Capability
+- [Capability Claim 1]
+- [Capability Claim 2]
+
+## 2. The Validation (Run-Time)
+*This section tracks the Work performed to verify the Method above.*
+
+### Plausibility Assessment
 
 | Filter | Score | Justification |
 |--------|-------|---------------|
@@ -120,21 +139,17 @@ scope:
 
 **Plausibility Verdict:** [PLAUSIBLE / MARGINAL / IMPLAUSIBLE]
 
-## Scope of Applicability
-
-**This hypothesis applies when:**
-- [Condition 1]
-- [Condition 2]
-
-**This hypothesis does NOT apply when:**
-- [Condition 1]
-- [Condition 2]
-
-## Assumptions
+### Assumptions to Verify
 - [ ] [Assumption 1 — must be true for this to work]
 - [ ] [Assumption 2]
 - [ ] [Assumption 3]
 
+### Required Evidence
+- [ ] **Internal Test:** [Verification test]
+  - **Performer:** [Developer | AI Agent | CI Pipeline]
+- [ ] **Research:** [External validation]
+  - **Performer:** [Developer | AI Agent]
+
 ## Falsification Criteria
 [What evidence would DISPROVE this hypothesis?]
 - If [X happens], this approach fails
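
To make the new frontmatter fields concrete, here is a minimal sketch that reads `formality`, `novelty`, and `complexity` from a hypothesis file and checks them against the values listed in the template. It assumes the frontmatter is delimited by `---` lines and uses flat `key: value` pairs; the helper names `read_frontmatter` and `check_hypothesis` are hypothetical, and a real YAML parser would be more robust.

```python
NOVELTY = {"Conservative", "Novel", "Radical"}
COMPLEXITY = {"Low", "Medium", "High"}


def read_frontmatter(text):
    """Collect top-level `key: value` pairs between the `---` markers.

    Assumption: frontmatter starts on the first line. Indented lines
    (such as the keys under `scope:`) are skipped, and anything after
    a `#` is treated as a comment.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if line[:1].isspace() or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.split("#", 1)[0].strip()
    return fields


def check_hypothesis(fields):
    """Return a list of problems with the three new fields."""
    problems = []
    formality = fields.get("formality", "")
    if not (len(formality) == 1 and formality in "0123456789"):
        problems.append("formality must be a single digit 0-9")
    if fields.get("novelty") not in NOVELTY:
        problems.append("novelty must be Conservative, Novel, or Radical")
    if fields.get("complexity") not in COMPLEXITY:
        problems.append("complexity must be Low, Medium, or High")
    return problems
```

A file whose frontmatter contains `formality: 4`, `novelty: Novel`, and `complexity: Low` would pass this check with an empty problem list.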
diff --git a/commands/fpf-3-research.md b/commands/fpf-3-research.md
@@ -71,6 +71,9 @@ This phase answers: **"What does the outside world know about this?"**
 # Read hypothesis to research
 cat .fpf/knowledge/L1/[hypothesis].md
 
+# Read project context for congruence assessment
+cat .fpf/context.md
+
 # Extract assumptions needing external validation
 grep -A10 "Assumptions" .fpf/knowledge/L1/[hypothesis].md
 
diff --git a/commands/fpf-3-test.md b/commands/fpf-3-test.md
@@ -85,6 +85,9 @@ This phase answers: **"Does this actually work in practice, in OUR context?"**
 # Read L1 hypotheses to test
 cat .fpf/knowledge/L1/*.md
 
+# Read project context
+cat .fpf/context.md
+
 # Check what assumptions need empirical verification
 grep -A10 "Assumptions" .fpf/knowledge/L1/*.md
 
diff --git a/commands/fpf-status.md b/commands/fpf-status.md
@@ -20,30 +20,27 @@ if [ ! -d ".fpf" ]; then
 fi
 ```
 
-### 2. Gather Statistics
+### 2. Gather Statistics (Agentic)
 
-```bash
-# Count knowledge at each level
-L0_COUNT=$(find .fpf/knowledge/L0 -name "*.md" 2>/dev/null | wc -l)
-L1_COUNT=$(find .fpf/knowledge/L1 -name "*.md" 2>/dev/null | wc -l)
-L2_COUNT=$(find .fpf/knowledge/L2 -name "*.md" 2>/dev/null | wc -l)
-INVALID_COUNT=$(find .fpf/knowledge/invalid -name "*.md" 2>/dev/null | wc -l)
-
-# Count artifacts
-EVIDENCE_COUNT=$(find .fpf/evidence -name "*.md" 2>/dev/null | wc -l)
-DRR_COUNT=$(find .fpf/decisions -name "DRR-*.md" 2>/dev/null | wc -l)
-SESSIONS_COUNT=$(find .fpf/sessions -name "*.md" 2>/dev/null | wc -l)
-```
+**Do not run a complex shell script.** Use your tools to gather data and compute metrics internally.
 
-### 3. Read Session State
+1.  **Count Files:**
+    - List files in `.fpf/knowledge/L0/`, `L1/`, `L2/`, `invalid/` to get counts.
+    - List files in `.fpf/evidence/`, `.fpf/decisions/`, `.fpf/sessions/`.
 
-```bash
-cat .fpf/session.md
-```
+2.  **Analyze Health & Formality:**
+    - **Read Frontmatter:** Read the first 10 lines of all active hypothesis files (L0, L1, L2) and evidence files.
+    - **Compute Internally:**
+        - **Formality Range:** Find the `formality:` value in hypotheses. Record Min and Max.
+        - **Trust Index:** Count total evidence files. Count how many have `valid_until` dates in the future.
+          - `Trust Index = Valid Evidence / Total Evidence`
+        - **Drift:** Identify L2 hypotheses. Check if their linked evidence files are expired.
+          - `Drift = L2 Hypotheses with Expired Evidence / Total L2 Hypotheses`
 
-Extract: `Phase:`, `Problem:`, Active hypotheses, Last transition
+3.  **Read Session State:**
+    - Read `.fpf/session.md` to get the current Phase, Problem, and active hypothesis list.
 
-### 4. Check for Issues
+### 3. Check for Issues
 
 Scan for:
 - Expired evidence (check `valid_until` dates)
@@ -64,6 +61,15 @@ Scan for:
 | **Started** | [timestamp] |
 | **Last Activity** | [timestamp] |
 
+### Project Health (Current Session)
+
+| Metric | Value |
+|--------|-------|
+| **Min Formality** | [MIN_FORMALITY] |
+| **Max Formality** | [MAX_FORMALITY] |
+| **Trust Index (Fresh Evidence)** | [TRUST_INDEX] (Valid: [VALID_EVIDENCE_COUNT]/[EVIDENCE_COUNT]) |
+| **Drift (L2 Expired Evidence)** | [DRIFT] (L2 Hypotheses with expired evidence: [L2_HYPS_WITH_EXPIRED_EVIDENCE]/[L2_COUNT]) |
+
 ### Knowledge Base
 
 | Level | Count | Description |
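
The two ratios defined in the new `/fpf-status` step can be written out as plain arithmetic. The sketch below is an illustration under stated assumptions, not the command's implementation (the command asks the agent to compute these internally). The function names are hypothetical; evidence without a `valid_until` date counts toward the total but not as valid, and an L2 hypothesis is treated as drifting if any of its linked evidence has expired.

```python
from datetime import date


def trust_index(valid_until_dates, today):
    """Trust Index = Valid Evidence / Total Evidence.

    `valid_until_dates` holds one entry per evidence file: a date, or
    None when the file has no `valid_until` field (assumed not valid).
    Returns None when there is no evidence at all.
    """
    if not valid_until_dates:
        return None
    valid = sum(1 for d in valid_until_dates if d is not None and d > today)
    return valid / len(valid_until_dates)


def drift(l2_evidence, today):
    """Drift = L2 hypotheses with expired evidence / total L2 hypotheses.

    `l2_evidence` maps each L2 hypothesis name to the `valid_until`
    dates of its linked evidence. Assumption: one expired item is
    enough to count the hypothesis as drifting.
    """
    if not l2_evidence:
        return None
    expired = sum(
        1
        for dates in l2_evidence.values()
        if any(d is not None and d <= today for d in dates)
    )
    return expired / len(l2_evidence)


if __name__ == "__main__":
    today = date(2025, 12, 13)
    evidence = [date(2026, 1, 1), date(2025, 1, 1), None, date(2026, 6, 1)]
    print(trust_index(evidence, today))
    print(drift({"h1": [date(2026, 1, 1)], "h2": [date(2025, 1, 1)]}, today))
```

Returning None for an empty knowledge base avoids a division by zero and lets the status report show a placeholder instead of a misleading 0.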
diff --git a/install.sh b/install.sh