This repository was archived by the owner on Jan 31, 2026. It is now read-only.

Commit 5f043f4

Merge pull request #6 from kaaloo/feat/notebook-support
docs: add Jupyter Notebook governance to constitution v1.8.0
2 parents 053e36e + 686eeed commit 5f043f4

3 files changed

Lines changed: 276 additions & 14 deletions

.specify/memory/constitution.md

Lines changed: 113 additions & 13 deletions
@@ -1,26 +1,35 @@
 <!--
 SYNC IMPACT REPORT
 ==================
-Version Change: 1.7.0 → 1.7.1
-Rationale: PATCH version bump - Reorganized principle ordering by moving Specification-Driven Development (formerly XIII) to position XI, renumbering subsequent principles for better logical flow
+Version Change: 1.7.1 → 1.8.0
+Rationale: MINOR version bump - Added new Principle XIII: Jupyter Notebook Discipline to govern exploratory data science workflows in government AI projects, and swapped positions of Streamlit-to-Production Bridge (now XIV) and Jupyter Notebook Discipline (now XIII)
 
-Modified Principles:
-- Principle XI: Now "Specification-Driven Development with SpecKit" (formerly Principle XIII)
-- Principle XII: Now "French Government AI Stack Integration" (formerly Principle XI)
-- Principle XIII: Now "Streamlit-to-Production Bridge" (formerly Principle XII)
+Added Sections:
+- Principle XIII: Jupyter Notebook Discipline - Establishes governance for notebooks in top-level notebooks/ folder
+  * Notebook categorization (exploratory, documentation, production-adjacent)
+  * Security requirements (credential sanitization, .gitignore enforcement)
+  * Quality standards (reproducibility, documentation, version control)
+  * Integration with SpecKit workflow and EU AI Act compliance
+  * Tooling standards (nbstripout, nbconvert, papermill)
 
-Rationale for Reordering:
-- Specification-Driven Development (new XI) logically follows Python-First Development (X) as both are foundational development practices
-- French Government AI Stack Integration (new XII) and Streamlit-to-Production Bridge (new XIII) are more specific implementation concerns that build on the foundational principles
+Modified Principles:
+- Principle XIII: Now "Jupyter Notebook Discipline" (new principle)
+- Principle XIV: Now "Streamlit-to-Production Bridge" (formerly Principle XIII)
 
 Removed Sections: N/A
 
 Templates Requiring Updates:
-- ✅ plan-template.md: Updated principle numbers in Constitution Check
+- ✅ plan-template.md: Updated - Added Principle XIII (Jupyter Notebook Discipline) checkbox to Constitution Check section; Streamlit-to-Production Bridge checkbox renumbered to XIV
 - ✅ spec-template.md: Already aligned (no principle-specific references)
 - ✅ tasks-template.md: Already aligned (no principle-specific references)
 
 Follow-up TODOs:
+- Create feature spec for notebooks/ folder infrastructure (002-jupyter-notebook-support)
+- Add nbstripout to pre-commit hooks
+- Add notebooks/ to .gitignore patterns for output files
+- Create notebook templates (exploratory, documentation, production-adjacent)
+- Add notebook linting configuration for ruff
+- Document notebook-to-production migration patterns
 - Update feature spec 001-setup-developer-experience to reflect pnpm standardization
 - Consider adding security homologation dossier template
 - Consider adding risk assessment template aligned with ANSSI requirements
@@ -384,7 +393,97 @@ ai-kit MUST provide first-class integrations with the emerging French Government
 
 **Rationale**: Standardizing on government-approved AI infrastructure ensures compliance, reduces duplication, and enables teams to focus on domain-specific value rather than infrastructure.
 
-### XIII. Streamlit-to-Production Bridge
+### XIII. Jupyter Notebook Discipline
+
+ai-kit projects MUST maintain Jupyter notebooks in a top-level `notebooks/` directory with clear governance to balance exploratory data science workflows with security, reproducibility, and compliance requirements.
+
+**Notebook Categories**:
+
+Notebooks MUST be organized by purpose to clarify their role in the development lifecycle:
+
+- **Exploratory** (`notebooks/exploratory/`): Rapid experimentation, hypothesis testing, data exploration
+  - Not subject to SpecKit workflow requirements
+  - May contain incomplete or experimental code
+  - MUST NOT contain production credentials or sensitive data
+  - Should be cleaned up or archived when insights are productionized
+
+- **Documentation** (`notebooks/documentation/`): Tutorials, examples, architectural explanations
+  - Subject to documentation quality standards
+  - MUST be reproducible and well-documented
+  - Should be reviewed as part of feature specifications
+  - Serve as living documentation for complex AI workflows
+
+- **Production-Adjacent** (`notebooks/production-adjacent/`): Notebooks that inform production decisions
+  - Model evaluation, performance benchmarking, compliance reporting
+  - MUST be reproducible and version-controlled
+  - MUST document data sources, model versions, and evaluation criteria
+  - Subject to EU AI Act documentation requirements for high-risk AI systems
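
The category layout above lends itself to a simple path check. The following is a minimal sketch, assuming a hypothetical helper named `notebook_category` that is not part of ai-kit; only the three folder names come from the principle itself.

```python
from pathlib import PurePosixPath

# Category folders named in the principle; the helper itself is hypothetical.
CATEGORIES = {"exploratory", "documentation", "production-adjacent"}


def notebook_category(path: str) -> str:
    """Return the governance category for a notebook path, or raise ValueError."""
    parts = PurePosixPath(path).parts
    # Expect at least notebooks/<category>/<file>.ipynb
    if len(parts) < 3 or parts[0] != "notebooks" or not path.endswith(".ipynb"):
        raise ValueError(f"{path!r} is not a notebook under notebooks/<category>/")
    if parts[1] not in CATEGORIES:
        raise ValueError(f"unknown notebook category {parts[1]!r} in {path!r}")
    return parts[1]
```

A check like this could run in CI or a pre-commit hook so that a notebook placed outside the three folders is flagged before review.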
+
+**Security Requirements (NON-NEGOTIABLE)**:
+
+- Notebooks MUST NOT contain hardcoded credentials, API keys, or sensitive data
+- Use environment variables or secure configuration management for secrets
+- Implement `nbstripout` or equivalent to remove notebook outputs before commit
+- Add `.gitignore` patterns for execution artifacts under `notebooks/` (for example `.ipynb_checkpoints/`) while keeping the `.ipynb` sources tracked
+- Conduct security review before publishing notebooks to public repositories
+- Document data sources and ensure compliance with GDPR and data protection regulations
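
To illustrate the environment-variable requirement above, here is a minimal sketch. The helper name `require_env` and the variable name `EXAMPLE_API_KEY` are assumptions for illustration, not names defined by ai-kit.

```python
import os


def require_env(name: str) -> str:
    """Read a secret from the environment instead of hardcoding it in a cell."""
    value = os.environ.get(name)
    if not value:
        # Fail loudly rather than letting a notebook run with a missing secret.
        raise RuntimeError(f"environment variable {name} is not set")
    return value
```

In a notebook cell this would be used as `api_key = require_env("EXAMPLE_API_KEY")`, so the committed source never contains the secret itself.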
+
+**Quality Standards**:
+
+- **Reproducibility**: Notebooks MUST include dependency specifications (requirements.txt, environment.yml, or uv workspace)
+- **Documentation**: Each notebook MUST include:
+  - Purpose and context (what question does this answer?)
+  - Author and date
+  - Data sources and versions
+  - Expected runtime and resource requirements
+  - Known limitations or assumptions
+- **Version Control**: Notebooks MUST be committed with outputs stripped (use `nbstripout` pre-commit hook)
+- **Code Quality**: Notebook code SHOULD follow Python standards (ruff linting where practical)
+- **Cell Organization**: Use markdown cells to structure narrative, avoid monolithic code cells
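
The documentation checklist above could be verified mechanically. This is a rough sketch, assuming the header lives in the notebook's first markdown cell and uses the label wording shown in `REQUIRED_FIELDS`; the constitution prescribes neither, and `missing_header_fields` is a hypothetical name.

```python
# Assumed header labels, derived from the documentation checklist above;
# the constitution does not prescribe exact wording.
REQUIRED_FIELDS = ("Purpose", "Author", "Date", "Data sources", "Runtime", "Limitations")


def missing_header_fields(notebook: dict) -> list[str]:
    """Return required labels absent from the notebook's first cell (if markdown)."""
    cells = notebook.get("cells", [])
    if not cells or cells[0].get("cell_type") != "markdown":
        return list(REQUIRED_FIELDS)
    source = cells[0].get("source", "")
    # In the .ipynb JSON format, "source" may be a string or a list of strings.
    text = "".join(source) if isinstance(source, list) else source
    lowered = text.lower()
    return [label for label in REQUIRED_FIELDS if label.lower() not in lowered]
```

The input is a parsed `.ipynb` file (for instance the result of `json.load`). The substring match is a deliberately loose heuristic; a stricter template check would be a design decision for the follow-up feature spec.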
+
+**Integration with SpecKit Workflow**:
+
+- **Exploratory notebooks**: Not required to follow SpecKit workflow, but insights MUST be captured in specifications when productionized
+- **Documentation notebooks**: Should be referenced in feature specifications (spec.md) and quickstart guides
+- **Production-adjacent notebooks**: MUST be documented in `plan.md` research section and referenced in compliance documentation
+
+**EU AI Act Compliance**:
+
+For high-risk AI systems, production-adjacent notebooks MUST:
+
+- Document model training data characteristics (representativeness, quality, completeness)
+- Record model evaluation metrics and validation results
+- Capture risk assessment findings and mitigation strategies
+- Provide audit trail for model selection and hyperparameter tuning decisions
+- Support technical documentation requirements for homologation dossier
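
One way to make the items above auditable is to have each production-adjacent notebook emit a structured record. The sketch below is purely illustrative: `EvaluationRecord` and all of its field names are assumptions, not a schema defined by the constitution or by the EU AI Act.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class EvaluationRecord:
    """Hypothetical audit record for one production-adjacent notebook run."""

    model_version: str
    data_sources: list[str]
    training_data_notes: str      # representativeness, quality, completeness
    metrics: dict[str, float]     # evaluation metrics and validation results
    risk_findings: list[str] = field(default_factory=list)
    selection_rationale: str = "" # why this model and these hyperparameters

    def to_json(self) -> str:
        """Serialize with sorted keys so records diff cleanly in version control."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Writing the JSON next to the stripped notebook would keep results under version control even though cell outputs are removed before commit.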
+
+**Tooling Standards**:
+
+- **nbstripout**: Pre-commit hook to remove outputs before commit
+- **nbconvert**: Convert notebooks to scripts or documentation formats
+- **papermill**: Parameterize and execute notebooks programmatically for reproducible reporting
+- **ruff**: Lint notebook code cells (via `nbqa` or similar)
+- **uv**: Manage notebook dependencies within monorepo workspace
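
To show what "removing outputs" means for the `.ipynb` JSON format, here is a minimal sketch of the idea. It is not nbstripout's implementation and is no substitute for the tool (which also handles metadata and git integration); `strip_outputs` is a hypothetical name.

```python
import copy


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a parsed .ipynb with code-cell outputs and counters cleared."""
    stripped = copy.deepcopy(notebook)  # leave the caller's notebook untouched
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped
```

Because outputs can embed query results, file paths, or tokens printed during debugging, clearing them before commit serves both the version-control and the security requirements.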
+
+**Migration to Production**:
+
+When notebook insights become production features:
+
+1. Extract reusable code into `packages/` or `apps/` with proper testing
+2. Document the notebook-to-production migration in feature specification
+3. Archive or move exploratory notebooks to `notebooks/archive/` to reduce clutter
+4. Retain production-adjacent notebooks for compliance and audit purposes
+5. Follow Principle XI (Specification-Driven Development) for production implementation
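
Step 3 of the list above can be sketched as a path mapping. This assumes the archive mirrors the sub-folder layout of `notebooks/exploratory/`, which the constitution does not specify; `archive_destination` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def archive_destination(path: str) -> str:
    """Map notebooks/exploratory/<rest> to notebooks/archive/<rest>."""
    parts = PurePosixPath(path).parts
    if len(parts) < 3 or parts[:2] != ("notebooks", "exploratory"):
        # Production-adjacent notebooks are retained in place (step 4).
        raise ValueError("only exploratory notebooks are moved to notebooks/archive/")
    return str(PurePosixPath("notebooks", "archive", *parts[2:]))
```

Refusing anything outside `notebooks/exploratory/` keeps the helper from accidentally relocating notebooks that must stay put for audit purposes.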
+
+**Rationale**: Jupyter notebooks are essential for AI/ML experimentation and data science workflows, but without governance they become security risks, compliance liabilities, and sources of technical debt. This principle acknowledges the exploratory nature of notebooks while establishing guardrails that prevent common pitfalls: credential leakage, irreproducible results, and undocumented model decisions. By categorizing notebooks and integrating them with SpecKit workflow, we enable rapid innovation while maintaining traceability for compliance and production migration.
+
+**References**:
+- [Jupyter Project](https://jupyter.org/)
+- [nbstripout](https://github.com/kynan/nbstripout)
+- [Papermill](https://papermill.readthedocs.io/)
+- [nbqa](https://github.com/nbQA-dev/nbQA)
+
+### XIV. Streamlit-to-Production Bridge
 
 ai-kit MUST provide a clear migration path from Streamlit prototypes to production-ready applications. This principle addresses the common pattern where:
 
@@ -575,7 +674,8 @@ All feature specifications and implementation plans MUST include a Constitution
 - Python-first development (Principle X)
 - Specification-driven development with SpecKit workflows (Principle XI)
 - Government AI stack integration requirements (Principle XII)
-- Streamlit-to-production support if applicable (Principle XIII)
+- Jupyter notebook discipline and governance if applicable (Principle XIII)
+- Streamlit-to-production support if applicable (Principle XIV)
 
 ### Complexity Justification
 
@@ -586,4 +686,4 @@ Any deviation from these principles MUST be documented with:
 - Plan to return to compliance if possible
 - Approval from project stakeholders
 
-**Version**: 1.7.1 | **Ratified**: 2025-10-11 | **Last Amended**: 2025-10-13
+**Version**: 1.8.0 | **Ratified**: 2025-10-11 | **Last Amended**: 2025-10-13

.specify/templates/plan-template.md

Lines changed: 2 additions & 1 deletion
@@ -45,7 +45,8 @@ Verify compliance with ai-kit constitution principles:
 - [ ] **Python-First Development (Principle X)**: Is Python the primary language? Are non-Python components justified?
 - [ ] **Specification-Driven Development (Principle XI)**: Does this feature follow the SpecKit workflow (specify → plan → tasks → implement)? Are all design artifacts present in specs/[###-feature-name]/? Is traceability maintained from spec to implementation?
 - [ ] **French Government AI Stack Integration (Principle XII)**: Does the feature integrate with OpenGateLLM, EvalAP, or other government AI services where applicable?
-- [ ] **Streamlit-to-Production Bridge (Principle XIII)**: If using Streamlit, is there a migration path to Reflex? Are ProConnect and DSFR integrations planned?
+- [ ] **Jupyter Notebook Discipline (Principle XIII)**: If using notebooks, are they organized in notebooks/ with proper categorization (exploratory/documentation/production-adjacent)? Are security requirements met (no credentials, nbstripout configured)? Are quality standards followed (reproducibility, documentation)?
+- [ ] **Streamlit-to-Production Bridge (Principle XIV)**: If using Streamlit, is there a migration path to Reflex? Are ProConnect and DSFR integrations planned?
 
 **Violations Requiring Justification**: [List any principle violations with rationale, or state "None"]
 
NOTEBOOK_CONSTITUTION_SUMMARY.md

Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
+# Jupyter Notebook Constitution Amendment Summary
+
+**Date**: 2025-10-13
+**Version Change**: 1.7.1 → 1.8.0 (MINOR)
+**Amendment**: Added Principle XIII: Jupyter Notebook Discipline (swapped with Streamlit-to-Production Bridge, now XIV)
+
+## What Was Added
+
+### New Principle XIII: Jupyter Notebook Discipline
+
+A comprehensive governance framework for Jupyter notebooks in the top-level `notebooks/` directory that balances:
+- **Exploratory freedom** for data science experimentation
+- **Security requirements** to prevent credential leakage
+- **Compliance obligations** for EU AI Act and security homologation
+- **Quality standards** for reproducibility and documentation
+
+## Key Components
+
+### 1. Notebook Categorization
+
+Three distinct categories with different governance levels:
+
+- **`notebooks/exploratory/`**: Rapid experimentation, not subject to SpecKit workflow
+- **`notebooks/documentation/`**: Tutorials and examples, subject to documentation standards
+- **`notebooks/production-adjacent/`**: Model evaluation and compliance reporting, subject to EU AI Act requirements
+
+### 2. Security Requirements (NON-NEGOTIABLE)
+
+- No hardcoded credentials or sensitive data
+- `nbstripout` pre-commit hook to remove outputs
+- `.gitignore` patterns for notebook execution artifacts
+- Security review before public publication
+- GDPR compliance for data sources
+
+### 3. Quality Standards
+
+- Reproducibility: dependency specifications required
+- Documentation: purpose, author, data sources, runtime requirements
+- Version control: outputs stripped before commit
+- Code quality: ruff linting where practical
+- Cell organization: structured narrative with markdown
+
+### 4. Integration with SpecKit Workflow
+
+- **Exploratory**: Not required to follow SpecKit, but insights must be captured when productionized
+- **Documentation**: Referenced in spec.md and quickstart guides
+- **Production-adjacent**: Documented in plan.md research section
+
+### 5. EU AI Act Compliance
+
+Production-adjacent notebooks for high-risk AI systems must document:
+- Model training data characteristics
+- Evaluation metrics and validation results
+- Risk assessment findings
+- Model selection audit trail
+- Technical documentation for homologation dossier
+
+### 6. Tooling Standards
+
+- **nbstripout**: Remove outputs before commit
+- **nbconvert**: Convert to scripts/docs
+- **papermill**: Parameterize and execute programmatically
+- **ruff**: Lint notebook code (via nbqa)
+- **uv**: Manage dependencies in monorepo
+
+### 7. Migration to Production
+
+Clear 5-step process:
+1. Extract code to `packages/` or `apps/`
+2. Document migration in feature spec
+3. Archive exploratory notebooks
+4. Retain production-adjacent for compliance
+5. Follow Principle XI for production implementation
+
+## Why This Matters
+
+### Problems Solved
+
+1. **Security Risk**: Prevents accidental credential commits in notebooks
+2. **Compliance Liability**: Ensures notebooks support EU AI Act documentation requirements
+3. **Technical Debt**: Provides clear migration path from exploration to production
+4. **Irreproducibility**: Mandates dependency management and documentation
+5. **Audit Trail**: Establishes governance for model decisions and evaluations
+
+### Alignment with Existing Principles
+
+- **Principle I (EU AI Act)**: Production-adjacent notebooks support compliance documentation
+- **Principle III (Security Homologation)**: Security requirements prevent credential leakage
+- **Principle IV (Open Source)**: Guidance on what can be published publicly
+- **Principle X (Python-First)**: Notebooks align with Python-first culture
+- **Principle XI (SpecKit)**: Integration with specification-driven workflow
+
+## Template Updates
+
+### ✅ Completed
+
+- **constitution.md**: Added Principle XIII (Jupyter Notebook Discipline) with full governance framework, swapped with Streamlit-to-Production Bridge (now XIV)
+- **plan-template.md**: Updated Constitution Check section with correct principle ordering (XIII: Notebooks, XIV: Streamlit)
+
+### No Changes Required
+
+- **spec-template.md**: No principle-specific references
+- **tasks-template.md**: No principle-specific references
+
+## Next Steps (Follow-up TODOs)
+
+When you create the notebooks support feature, you should:
+
+1. **Create feature spec**: `specs/002-jupyter-notebook-support/`
+2. **Pre-commit hooks**: Add nbstripout configuration
+3. **Gitignore patterns**: Add patterns for execution artifacts under `notebooks/` (keep the `.ipynb` sources tracked)
+4. **Notebook templates**: Create templates for each category
+5. **Ruff configuration**: Add notebook linting via nbqa
+6. **Migration guide**: Document notebook-to-production patterns
+7. **Directory structure**: Create `notebooks/{exploratory,documentation,production-adjacent,archive}/`
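
The directory structure in step 7 could be scaffolded with a short script. This is a sketch under assumptions: `create_notebook_dirs` is a hypothetical name, and the `.gitkeep` placeholder is a common convention for tracking empty folders in git rather than anything the constitution requires.

```python
from pathlib import Path

# Folder names taken from step 7 above.
NOTEBOOK_DIRS = ("exploratory", "documentation", "production-adjacent", "archive")


def create_notebook_dirs(repo_root: Path) -> list[Path]:
    """Create notebooks/<category>/ folders under repo_root; safe to re-run."""
    created = []
    for name in NOTEBOOK_DIRS:
        directory = repo_root / "notebooks" / name
        directory.mkdir(parents=True, exist_ok=True)
        # Assumed convention: an empty placeholder so git tracks the folder.
        (directory / ".gitkeep").touch()
        created.append(directory)
    return created
```

Calling `create_notebook_dirs(Path("."))` from the repository root would produce the four folders; running it again changes nothing.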
+
+## Rationale for MINOR Version Bump
+
+This is a **MINOR** (1.8.0) rather than PATCH because:
+- **New principle added**: Expands governance scope to notebooks
+- **New mandatory requirements**: Security and quality standards for notebooks
+- **New tooling standards**: nbstripout, papermill, nbqa
+- **Material guidance expansion**: Comprehensive framework, not just clarification
+
+Not MAJOR because:
+- **No breaking changes**: Existing projects without notebooks are unaffected
+- **Backward compatible**: Adds requirements only for new notebook usage
+- **No principle removals**: All existing principles remain intact
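
The version arithmetic behind the rationale above can be written out explicitly. This is a minimal sketch assuming plain `MAJOR.MINOR.PATCH` strings with no pre-release suffix; `bump_version` is a hypothetical helper, not part of SpecKit.

```python
def bump_version(version: str, level: str) -> str:
    """Return the next MAJOR.MINOR.PATCH version for the given bump level."""
    major, minor, patch = (int(part) for part in version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        # A MINOR bump resets PATCH to zero.
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump level: {level!r}")
```

With the versions in this amendment, `bump_version("1.7.1", "minor")` gives `"1.8.0"`, while the previous reordering-only change was `bump_version("1.7.0", "patch")`, giving `"1.7.1"`.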
+
+## Suggested Commit Message
+
+```
+docs: amend constitution to v1.8.0 (add Principle XIII: Jupyter Notebook Discipline)
+
+- Add comprehensive governance framework for notebooks/ directory
+- Establish security requirements (nbstripout, no credentials)
+- Define notebook categories (exploratory, documentation, production-adjacent)
+- Integrate with SpecKit workflow and EU AI Act compliance
+- Update plan-template.md Constitution Check with Principle XIII (Streamlit renumbered to XIV)
+- Provide clear migration path from notebooks to production code
+
+Rationale: Jupyter notebooks are essential for AI/ML experimentation but
+require governance to prevent security risks, compliance liabilities, and
+technical debt accumulation.
+```
+
+## Questions for Clarification
+
+Before creating the feature spec, consider:
+
+1. **Notebook execution environment**: Should notebooks run in the shared `.venv` or isolated environments?
+2. **CI/CD integration**: Should notebooks be executed in CI for validation?
+3. **Notebook templates**: What starter templates would be most valuable (data exploration, model evaluation, compliance reporting)?
+4. **Integration with existing tools**: How should notebooks interact with `apps/` and `packages/`?
+5. **Compliance tooling**: Do you need automated compliance checks for production-adjacent notebooks?
+
+---
+
+**Constitution Version**: 1.8.0
+**Amendment Status**: ✅ Complete
+**Ready for Feature Spec**: Yes
