Add Safe Scientific Development System for Claude Code #10
- env_check.sh: Conda environment detection and validation
- numerical_validation.sh: Snapshot and invariant checking utilities
- Fix .gitignore to allow .claude/hooks/lib/ (was blocked by /lib/ pattern)

Part of safe scientific development system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
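The PR page does not show the contents of env_check.sh, so the following is only a minimal sketch of what conda environment detection could look like. The function name `in_project_conda_env` and the decision to treat `base` as "not activated" are assumptions made for illustration. `CONDA_DEFAULT_ENV` is the variable that `conda activate` sets.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the PR's actual env_check.sh.

# Succeeds when a non-base conda environment is active.
# An optional first argument names the environment that must be active.
in_project_conda_env() {
    local expected="${1:-}"
    local active="${CONDA_DEFAULT_ENV:-}"
    [ -n "$active" ] || return 1          # no conda environment active
    [ "$active" != "base" ] || return 1   # assumption: base does not count
    [ -z "$expected" ] || [ "$active" = "$expected" ]
}

if in_project_conda_env; then
    echo "conda environment active: $CONDA_DEFAULT_ENV"
else
    echo "WARNING: no project conda environment is active" >&2
fi
```

A hook could call the function before running a Python command and print the warning instead of blocking, which matches the "warns" behaviour described for the pre-tool-use hook.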
- Warns when Python commands run outside conda environment
- Reminds to run tests before commits
- Blocks snapshot updates without approval

Part of safe scientific development system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
The find command on line 42 was looking for files matching a "*.pytest_cache" pattern, but no such files exist. The .pytest_cache directory contains entries with various names (like .gitignore, CACHEDIR.TAG, README.md, v/cache/, etc.), not files ending in .pytest_cache.

Changed from:
    find .pytest_cache -name "*.pytest_cache" -mmin -5
To:
    find .pytest_cache -type f -mmin -5

This correctly finds any files in the .pytest_cache directory that were modified within the last 5 minutes, properly detecting recent test runs.

Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
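The difference between the two invocations can be seen on a throwaway directory. The sketch below builds a fake cache in a temporary location (the scratch directory and the `lastfailed` file name are assumptions for illustration) and counts what each command matches. The count for the old pattern is printed rather than asserted, because `-name` is also tested against the starting directory itself and the result can differ from what one expects.

```shell
#!/usr/bin/env bash
# Sketch: compare the old and new find invocations on a scratch cache directory.

workdir="$(mktemp -d)"              # hypothetical scratch location
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Mimic a freshly written pytest cache (names taken from the commit message).
mkdir -p .pytest_cache/v/cache
touch .pytest_cache/.gitignore .pytest_cache/CACHEDIR.TAG .pytest_cache/README.md
touch .pytest_cache/v/cache/lastfailed

# Old invocation: filters on a name pattern that the cache files do not have.
old_matches="$(find .pytest_cache -name "*.pytest_cache" -mmin -5 | wc -l | tr -d ' ')"

# New invocation: every regular file modified less than 5 minutes ago.
new_matches="$(find .pytest_cache -type f -mmin -5 | wc -l | tr -d ' ')"

echo "old pattern matched: $old_matches"
echo "new pattern matched: $new_matches"
```

With `-type f`, the check depends on the modification times of the files inside the cache rather than on how they happen to be named.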
- Detects when snapshot files have changed
- Reminds Claude to provide full analysis before updates

Part of safe scientific development system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
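How the hook recognizes snapshot files is not shown on this page. As a rough sketch, detection could filter the list of changed paths reported by `git diff --name-only`. The naming convention below (a `snapshots/` directory, or a `.snapshot` / `.golden` suffix) and both function names are assumptions, not the project's actual rules.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of snapshot change detection.

# Reads paths on stdin and keeps only those that look like snapshot files.
# Assumed convention: a snapshots/ directory or a .snapshot/.golden suffix.
filter_snapshot_paths() {
    grep -E '(^|/)snapshots/|\.(snapshot|golden)$' || true
}

# Lists snapshot files that differ from HEAD in the working tree.
changed_snapshots() {
    git diff --name-only HEAD 2>/dev/null | filter_snapshot_paths
}

changed="$(changed_snapshots)"
if [ -n "$changed" ]; then
    echo "Snapshot files changed; provide a full analysis before updating:"
    echo "$changed"
fi
```

Keeping the path filter separate from the git call makes the matching rule easy to test with plain text input.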
Pragmatic TDD workflow for scientific code with numerical validation. Part of safe scientific development system.
Comprehensive numerical validation workflow with tolerance guidelines. Part of safe scientific development system. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Zero-tolerance workflow for behavior-preserving refactoring. Part of safe scientific development system. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
…and workflow guide

Add three critical sections to CLAUDE.md for safe scientific development:

**Task 7 - Critical Operational Rules:**
- Mandatory skills usage (scientific-tdd, numerical-validation, safe-refactoring, jax)
- Environment enforcement rules (conda activation required)
- Guided autonomy boundaries (what Claude can do vs. must ask permission)
- Snapshot update approval process with required 4-part analysis format

**Task 8 - Numerical Accuracy Standards:**
- When numerical validation is required (which files/components)
- Tolerance specifications table (1e-14 for refactoring, 1e-10 for algorithms)
- Mathematical invariants that must always hold (probabilities, stochastic matrices, etc.)
- Validation commands (property tests, golden regression, snapshots)

**Task 9 - Workflow Selection Guide:**
- Decision tree for selecting appropriate workflow/skill
- Task-based guidance (new features, bugs, refactoring, JAX, etc.)
- JAX code requirements and best practices
- JAX-specific validation checklist

These enhancements provide Claude with clear operational guidelines, numerical accuracy requirements, and workflow selection criteria for scientific development.

Before: 132 lines
After: 357 lines
Added: 225 lines

Part of safe scientific development system implementation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Task 10: Create .claude/skills/README.md
- Overview of all 3 skills (scientific-tdd, numerical-validation, safe-refactoring)
- Usage guidance for each skill
- Workflow summaries
- Integration and maintenance information

Task 11: Create .claude/hooks/README.md
- Overview of both hooks (pre-tool-use.sh, user-prompt-submit.sh)
- Utility library documentation (env_check.sh, numerical_validation.sh)
- Testing procedures and expected behavior
- Debugging guidance
- Integration with skills explanation

Both READMEs provide clear documentation of the safe scientific development system components.

Part of safe scientific development system implementation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Task 12: Integration test script (tests/test_safe_dev_system.sh)
- Tests all 8 categories: hook utilities, hooks, skills, CLAUDE.md, functional tests, snapshot protection, git config, documentation
- Validates complete system installation and functionality
- All tests passing

Task 13: Comprehensive user guide (docs/SAFE_DEVELOPMENT_GUIDE.md)
- Quick start for users and Claude
- Three-layer system explanation
- Common workflow examples with checkpoints
- Approval gate processes with checklists
- Numerical tolerance guidelines
- Troubleshooting guide
- Customization instructions
- Best practices and FAQ

Both are final documentation/testing deliverables for the safe scientific development system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Complete verification of all components, functional testing results, and maintenance procedures. Final deliverable for safe scientific development system. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- Add proper XML metadata headers to scientific-tdd skill
- Add proper XML metadata headers to numerical-validation skill
- Add proper XML metadata headers to safe-refactoring skill

Headers follow Claude Code skill format with name, description, tags, and version.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Pull Request Overview
This PR implements a comprehensive three-layer defense system to ensure safe, regression-free scientific development with Claude Code. The system combines automatic enforcement through hooks, workflow guidance through skills, and enhanced documentation to prevent numerical regressions and maintain code quality.
Key changes include:
- Layer 1: Hooks that automatically enforce environment requirements, test reminders, and snapshot protection
- Layer 2: Three skills (scientific-tdd, numerical-validation, safe-refactoring) providing structured workflows for different development scenarios
- Layer 3: Enhanced CLAUDE.md with operational rules, numerical standards, and workflow decision trees
Reviewed Changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/test_safe_dev_system.sh | Integration test suite validating all system components |
| docs/SYSTEM_VERIFICATION.md | Comprehensive verification report documenting system status and capabilities |
| docs/SAFE_DEVELOPMENT_GUIDE.md | User guide explaining workflows, approval gates, and troubleshooting |
| CLAUDE.md | Enhanced with critical operational rules, numerical standards, and workflow selection guide |
| .claude/skills/scientific-tdd/skill.md | Pragmatic TDD workflow for new features with numerical validation |
| .claude/skills/safe-refactoring/skill.md | Zero-tolerance refactoring workflow ensuring exact behavioral match |
| .claude/skills/numerical-validation/skill.md | Comprehensive numerical correctness verification workflow |
| .claude/skills/README.md | Overview of available skills and their usage patterns |
| .claude/hooks/user-prompt-submit.sh | Post-prompt hook detecting snapshot changes |
| .claude/hooks/pre-tool-use.sh | Pre-execution hook enforcing environment and snapshot requirements |
| .claude/hooks/lib/numerical_validation.sh | Utilities for snapshot detection and approval management |
| .claude/hooks/lib/env_check.sh | Utilities for conda environment validation |
| .claude/hooks/README.md | Comprehensive hook documentation and debugging guide |
- Fix import sorting (ruff I001)
- Remove unused variable assignment (ruff F841)
- Remove trailing whitespace (ruff W291)
- Remove whitespace from blank lines (ruff W293)
- Remove unused import (ruff F401)

All auto-fixed with ruff check --fix and ruff format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Black and ruff have slightly different formatting preferences. Applying black formatting to match CI requirements. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- Remove black formatting check from CI workflow
- Use ruff format exclusively for code formatting
- Reformat all files with ruff format

Ruff provides equivalent formatting to black with additional linting capabilities, simplifying the toolchain.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Previous commit included the workflow file but the edit didn't apply correctly. This commit properly removes the black formatting check step. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Summary
Implements a three-layer defense system to ensure safe, regression-free scientific development with Claude Code:
Key Features
✅ Environment Consistency: Conda environment checking for all Python commands
✅ Regression Prevention: Snapshot change detection with approval gates, test-before-commit reminders
✅ Numerical Accuracy: Tolerance specifications (1e-14 for refactoring, 1e-10 for algorithms), mathematical invariant verification
✅ Guided Autonomy: Claude runs tests/validates automatically, asks permission for commits/snapshot updates
✅ Pragmatic TDD: Test-first for new features, test-verify for simple bugs
✅ JAX Integration: Special handling for JAX code optimization and validation
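As an illustration of how the two tolerance levels and an invariant such as "probabilities sum to 1" could be checked from a shell utility, here is a minimal sketch that delegates the floating-point arithmetic to awk. The function names are hypothetical, and the use of an absolute (rather than relative) tolerance is an assumption; the PR page does not say which the project uses.

```shell
#!/usr/bin/env bash
# Hypothetical tolerance helpers; awk performs the double-precision arithmetic.

# approx_equal A B TOL: succeeds when |A - B| <= TOL (absolute tolerance).
approx_equal() {
    awk -v a="$1" -v b="$2" -v tol="$3" \
        'BEGIN { d = a - b; if (d < 0) d = -d; exit !(d <= tol + 0) }'
}

# sums_to_one TOL P1 P2 ...: succeeds when the values sum to 1 within TOL.
sums_to_one() {
    local tol="$1"
    shift
    printf '%s\n' "$@" | awk -v tol="$tol" \
        '{ s += $1 } END { d = s - 1; if (d < 0) d = -d; exit !(d <= tol + 0) }'
}

# Refactoring must reproduce results to 1e-14; algorithm changes to 1e-10.
if approx_equal 0.5 0.5 1e-14; then echo "refactoring check passed"; fi
if sums_to_one 1e-10 0.2 0.3 0.5; then echo "probability invariant holds"; fi
```

For values far from unit scale a relative tolerance would usually be the more meaningful choice; the absolute form is used here only to keep the sketch short.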
Components Delivered
Hooks (4 files)
- pre-tool-use.sh - Environment validation, test reminders, snapshot protection
- user-prompt-submit.sh - Snapshot change detection
- lib/env_check.sh - Conda environment utilities
- lib/numerical_validation.sh - Validation and approval utilities

Skills (3 files)
- scientific-tdd - Pragmatic test-driven development workflow
- numerical-validation - Comprehensive numerical correctness verification
- safe-refactoring - Zero-tolerance behavior-preserving refactoring

Documentation (5 files)
- CLAUDE.md with critical operational rules, numerical standards, workflow guide
- SAFE_DEVELOPMENT_GUIDE.md - Comprehensive user guide
- SYSTEM_VERIFICATION.md - Complete verification report

Testing (1 file)
- test_safe_dev_system.sh - Integration test (24/24 checks passing)

Test Plan
Verification
Run the integration test:
Expected: All tests pass (24/24)
Usage
After merging, Claude Code will automatically:
See docs/SAFE_DEVELOPMENT_GUIDE.md for complete usage instructions.

🤖 Generated with Claude Code