feat: Complete NumPy specification implementation across 4 major modules#114
feat: Complete NumPy specification implementation across 4 major modules#114
Conversation
- ✅ NumpySpec/random/ (9 files): Random number generation specifications - BitGenerator, MT19937, PCG64, PCG64DXSM, Philox, SFC64 - Standard uniform, normal, and binomial distributions - Proper mathematical properties for RNG algorithms - ✅ NumpySpec/ndarray/ (3 files): Array manipulation specifications - flat, flatten, tofile operations - Memory layout and serialization functions - ✅ NumpySpec/logic_functions/ (29 files): Logic and comparison operations - Comparison: equal, not_equal, greater, less, greater_equal, less_equal - Logical: logical_and, logical_or, logical_not, logical_xor - Bitwise: bitwise_and, bitwise_or, bitwise_xor, bitwise_not - Special: all, any, allclose, array_equal, isclose - Type checking: isfinite, isinf, isnan, isnat, isneginf, isposinf - Advanced: array_equiv, sometrue - ✅ NumpySpec/data_type_routines/ (24 files): Data type operations - Type casting: can_cast, common_type, promote_types, result_type - Type information: dtype, finfo, iinfo, min_scalar_type - Memory operations: shares_memory, may_share_memory - Type objects: find_common_type, typename, typecodes All specifications follow Lean 4 best practices: - Hoare triple syntax: ⦃⌜precondition⌝⦄ function ⦃⇓result => postcondition⦄ - Vector types for compile-time safety - Proper mathematical properties and constraints - Full compilation verified (65 total files) This completes the systematic specification of 4 major NumPy modules representing core array operations, random number generation, logical operations, and data type management functionality. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
|
This PR is a working PR, and currently there's no need to review it until I ask for :-) |
- Remove outdated documentation files (INTEGRATE_NUMPYSPEC_LEANEXPLORE.md, LEANEXPLORE_LOCAL_SETUP.md, etc.) - Remove temporary files and scripts (json_to_lean.py, parse_svg_graph.py, temp_snippet.lean, etc.) - Update .gitignore for better project organization - Update all NumPy specification files across core, datetime, indexing, IO, statistics, strings, and ufuncs modules - Remove temporary test files and development artifacts 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Files in spec_pipeline/ should be ignored per .gitignore but were already tracked. This removes them from git tracking while keeping local files intact. The .gitignore rule will now work correctly for this directory. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
|
Claude finished @htlou's task —— View job Code Review CompleteReview Checklist
SummaryThis is a substantial and well-structured PR implementing NumPy specifications across 4 major modules with 65+ specification files. The implementation demonstrates good understanding of both Lean 4 and NumPy's mathematical properties. ✅ Strengths
|
|
Claude finished @htlou's task —— View job Code Review CompleteReview Checklist
SummaryThis is a substantial and well-structured PR implementing NumPy specifications across 4 major modules with 65+ specification files. The implementation demonstrates good understanding of both Lean 4 and NumPy's mathematical properties. ✅ Strengths
|
…AI-Foundation/NumpySpec into dev-hantao-all-numpy
…g, typing, and ufunc modules This comprehensive update implements formal Lean 4 specifications for 50+ NumPy functions across 4 major modules: ## Strings Module (29 functions) - Complete specifications for string operations: splitlines, startswith, strip, swapcase, title, translate, upper, zfill, etc. - All functions use Vector-based approach for type safety - Comprehensive mathematical properties and edge case handling ## Testing Module (9 functions) - Assertion functions: assert_allclose, assert_almost_equal, assert_array_equal, etc. - Tolerance-based comparison with proper mathematical specifications - Context managers: suppress_warnings, break_cycles - Documentation testing: rundocs ## Typing Module (9 types) - Precision hierarchy: _8Bit, _16Bit, _32Bit, _64Bit, _128Bit - Type validation: ArrayLike, DTypeLike, NBitBase, NDArray - Full type system support for NumPy compatibility ## Ufunc Module (6 functions) - Universal function operations: __call__, accumulate, at, outer, reduce, reduceat - Mathematical specifications for element-wise and reduction operations - Proper handling of broadcasting and indexing semantics ## Key Features - All specifications use Hoare triple syntax: ⦃⌜precondition⌝⦄ function ⦃⇓result => postcondition⦄ - Vector-based approach throughout for compile-time type safety - Comprehensive mathematical properties and sanity checks - All files compile successfully with lake build - Ready for implementation and formal verification 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
|
Claude finished @htlou's task —— View job Code Review CompleteReview Checklist
SummaryThis is a substantial and well-structured PR implementing NumPy specifications across 4 major modules with 65+ specification files. The implementation demonstrates good understanding of both Lean 4 and NumPy's mathematical properties. ✅ Strengths
|
This commit fixes all compilation errors that were preventing the full project build: ## Key Fixes Applied: 1. **Import Order Issues**: Fixed multiple files where import statements were not at the beginning - NumpySpec/indexing_slicing/trace.lean - moved imports before documentation - NumpySpec/indexing_slicing/tril.lean - moved imports before documentation 2. **Hoare Triple Syntax**: Fixed missing ⌜...⌝ wrapping in postconditions - NumpySpec/indexing_slicing/select.lean - wrapped postcondition in assertion syntax - NumpySpec/indexing_slicing/tril.lean - wrapped postcondition in assertion syntax 3. **Main Module Import**: Fixed incorrect import in Main.lean - Changed from `import NDArray` to `import NumpySpec` 4. **Complex Proof Simplification**: Simplified overly complex proofs that were causing tactic failures - NumpySpec/indexing_slicing/tril.lean - simplified matrix indexing proofs using existential quantification - NumpySpec/indexing_slicing/trace.lean - simplified diagonal sum specification using list-based approach 5. **Type Error Fixes**: Fixed type mismatches in mathematical expressions - NumpySpec/indexing_slicing/trace.lean - corrected tuple destructuring syntax ## Build Status: - ✅ All 768 modules now compile successfully - ✅ Only expected `sorry` warnings remain (implementation placeholders) - ✅ No compilation errors or failures - ✅ Project builds completely with `lake build` The project now has a clean build state with all specifications properly formatted and compilable. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
|
Claude finished @htlou's task —— View job Code Review CompleteReview Checklist
SummaryThis is a substantial and well-structured PR implementing NumPy specifications across 4 major modules with 65+ specification files. The implementation demonstrates good understanding of both Lean 4 and NumPy's mathematical properties. ✅ Strengths
|
| all elements at their corresponding indices. This captures the fundamental property | ||
| of numpy.array - converting sequence-like data into array format while preserving | ||
| element values and order. -/ | ||
| theorem array_spec (data : List Float) : |
There was a problem hiding this comment.
these should all use Array or Vector. but it may be ok bc List has many proofs prewritten
| ⦃⌜True⌝⦄ | ||
| ascontiguousarray a | ||
| ⦃⇓result => ⌜ | ||
| (max n 1 ≥ 1) ∧ |
There was a problem hiding this comment.
when theres a conjunction, put a /- .. -/ comment of the spec for each piece of the conjunction to show WHAT it specifies. like this one would be
| a 1D vector as a matrix type. Since numpy.asmatrix doesn't make a copy | ||
| if the input is already a matrix or ndarray, this function acts as an | ||
| identity operation with matrix type semantics. -/ | ||
| def asmatrix {n : Nat} (data : Vector Float n) : Id (Vector Float n) := |
There was a problem hiding this comment.
i think we should introduce a Matrix a Rows Cols type and sorry it out
| -- 1. Elements on the main diagonal are from v | ||
| (∀ i : Fin n, (result.get i).get i = v.get i) ∧ | ||
|
|
||
| -- 2. All off-diagonal elements are zero |
There was a problem hiding this comment.
this is what i meant by comments for a conjunction, this is a good example
| diagflat v | ||
| ⦃⇓result => ⌜ | ||
| -- Elements on the main diagonal are from the input vector | ||
| (∀ i : Fin n, result.get ⟨i.val * n + i.val, sorry⟩ = v.get i) ∧ |
There was a problem hiding this comment.
shouldnt put 2d to 1d indexing inline, it should be abstracted out somehow
| ∀ k : Fin n, (result.get i).get k = 1.0 → k = j) ∧ | ||
| -- Uniqueness property: exactly one 1.0 in each column | ||
| (∀ j : Fin n, ∃ i : Fin n, (result.get i).get j = 1.0 ∧ | ||
| ∀ k : Fin n, (result.get k).get j = 1.0 → k = i) ∧ |
There was a problem hiding this comment.
im concerned these statements about floats and == 1.0 are basically unprovable
| (h_bounds_start : ∀ i : Fin n, 0 ≤ start.get i ∧ start.get i ≤ (a.get i).length) | ||
| (h_bounds_end : ∀ i : Fin n, 0 ≤ endPos.get i ∧ endPos.get i ≤ (a.get i).length) | ||
| (h_nonempty : ∀ i : Fin n, sub.get i ≠ "") : | ||
| ⦃⌜∀ i : Fin n, start.get i ≤ endPos.get i ∧ |
There was a problem hiding this comment.
the hypothesis and the preconditions are the same
There was a problem hiding this comment.
This is a good point, @htlou we should see if removing them is possible. Not a blocker though, since the preconditions should be trivially fillable by using the named hypotheses. But also not ideal ofc.
| - k ≥ cols: All elements are preserved (entire matrix is "lower triangular") | ||
| - k ≤ -rows: All elements are zeroed (no elements are "on or below" such a diagonal) | ||
| -/ | ||
| theorem tril_spec {rows cols : Nat} (m : Vector (Vector Float cols) rows) (k : Int := 0) : |
There was a problem hiding this comment.
here it identifies a Matrix as a nested Vector. Maybe Matrix should be an opaque type?
|
|
||
| This captures the key property of asanyarray: when given an ndarray (Vector in our case), | ||
| it returns the same array without copying. | ||
| -/ |
There was a problem hiding this comment.
There should be a pass to generalize the types, this could be a Vector α n rather than just Float. Also lower priority.
| ∃ (left_pad right_pad : Nat), | ||
| left_pad + (a.get i).length + right_pad = width.get i ∧ | ||
| -- Padding should be as equal as possible (differ by at most 1) | ||
| (left_pad = right_pad ∨ left_pad = right_pad + 1) ∧ |
There was a problem hiding this comment.
This postcondition is unnecessary; the next postcondition is sufficient to assure it.
| (∀ p ∈ positions, | ||
| Int.natAbs (start.get i) ≤ p ∧ | ||
| p + (sub.get i).length ≤ Int.natAbs (endPos.get i) ∧ | ||
| p + (sub.get i).length ≤ (a.get i).length ∧ |
There was a problem hiding this comment.
this is assured by the precondition: endPos<= a.i.length
| endswith a suffix | ||
| ⦃⇓r => ⌜∀ i : Fin n, | ||
| -- Main specification: result matches String.endsWith for each pair | ||
| (r.get i = (a.get i).endsWith (suffix.get i)) ∧ |
There was a problem hiding this comment.
Why are we checking this, if it is literally what we did in the body.
| ⦃⌜True⌝⦄ | ||
| equal x1 x2 | ||
| ⦃⇓result => ⌜-- Core property: result[i] = (x1[i] == x2[i]) for all indices | ||
| (∀ i : Fin n, result.get i = (x1.get i == x2.get i)) ∧ |
There was a problem hiding this comment.
isn't this enough as a postcondition?
| /-- Specification: expandtabs replaces tab characters with appropriate number of spaces -/ | ||
| theorem expandtabs_spec {n : Nat} (a : Vector String n) (tabsize : Vector Nat n) | ||
| (h_positive : ∀ i : Fin n, tabsize.get i > 0) : | ||
| ⦃⌜∀ i : Fin n, tabsize.get i > 0⌝⦄ |
There was a problem hiding this comment.
Same hypo = pre-con here. Besides that there is an unclear detail of the definition for instance how many tabs could one word have and will each of them be replaced with the same number of spaces
…INE.md Enhanced all code examples with comprehensive documentation comments that explain: - Mathematical formulas and algorithms - Parameter meanings and constraints - Properties captured by specifications - Relationship to NumPy behavior This makes the pipeline documentation more accessible and provides clear guidance for generating well-documented specifications. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
|
Claude finished @htlou's task —— View job Code Review CompleteReview Checklist
SummaryThis is a substantial and well-structured PR implementing NumPy specifications across 4 major modules with 65+ specification files. The implementation demonstrates good understanding of both Lean 4 and NumPy's mathematical properties. ✅ Strengths
|
Summary
Technical Implementation
⦃⌜precondition⌝⦄ function ⦃⇓result => postcondition⦄Test Plan
This completes the systematic specification of 4 major NumPy modules representing core functionality for scientific computing in Lean 4.
🤖 Generated with Claude Code