Skip to content

Conversation

@f64u
Copy link

@f64u f64u commented Nov 23, 2025

This PR adds fold operations and container-style operations for BitVec, following patterns from List and Array. This includes fold operations (foldr, foldl, foldrIdx, foldlIdx with cons theorems), conversion operations (ofFn, toList, toArray, toVector), counting operations (popcount, zerocount, countP, countIdxP), transformation operations (map, mapIdx, zipWith), and distance/similarity operations (dot, hammingDist, parity).

Related Zulip discussion: https://leanprover.zulipchat.com/#narrow/channel/217875-Is-there-code-for-X.3F/topic/Why.20isn't.20there.20code.20for.20.60BitVec.2Efoldr.60.3F

New modules

  • Init.Data.BitVec.Folds (additions): Added foldrIdx_nil, foldrIdx_cons, foldlIdx_nil, foldlIdx_cons theorems for indexed folds
  • Init.Data.BitVec.OfFn: Conversion operations (ofFn, toList, toArray, toVector, ofArray, ofVector)
  • Init.Data.BitVec.Count: Counting operations (popcount, zerocount, countP, countIdxP)
  • Init.Data.BitVec.Map: Transformation operations (map, mapIdx, zipWith)
  • Init.Data.BitVec.Hamming: Distance and similarity operations (dot, hammingDist, parity)

Implementation details

  • All operations include theorems and lemmas
  • Added induction principle for BitVec treating it as a cons-based structure
  • Added cons theorems for indexed fold operations (foldrIdx, foldlIdx)
  • Test coverage in tests/lean/run/bitvec_container.lean (60+ test cases)

Notes

  • I'm open to working on golfing or profiling some of the proofs, but wanted to get your opinion on whether this is all worth adding to stdlib
  • Perhaps some modules like Hamming.lean should go to mathlib4 instead?

@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Nov 23, 2025
@leanprover-community-bot
Copy link
Collaborator

leanprover-community-bot commented Nov 23, 2025

Mathlib CI status (docs):

  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase e39894e62d363322fa18af76e12e54799496e2c0 --onto 41356740210745d6b9735cb779fbd0f9eee591e5. You can force Mathlib CI using the force-mathlib-ci label. (2025-11-23 23:01:20)
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase e39894e62d363322fa18af76e12e54799496e2c0 --onto f2e191d0afbb5adcc6f12e1779c8b141fc9af996. You can force Mathlib CI using the force-mathlib-ci label. (2025-11-24 21:14:16)
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase e39894e62d363322fa18af76e12e54799496e2c0 --onto b0e6db322421ce368999f08870fd747e45242f86. You can force Mathlib CI using the force-mathlib-ci label. (2025-11-27 07:13:34)

@leanprover-bot
Copy link
Collaborator

Reference manual CI status:

  • ❗ Reference manual CI will not be attempted unless your PR branches off the nightly-with-manual branch. Try git rebase e39894e62d363322fa18af76e12e54799496e2c0 --onto d3dda9f6d4428a906c096067ecb75e432afc4615. You can force reference manual CI using the force-manual-ci label. (2025-11-23 23:01:21)

@linesthatinterlace
Copy link
Contributor

I think you should have reverse induction (inducting from the least significant bit) as well. This would use BitVec.concat.

@f64u
Copy link
Author

f64u commented Nov 24, 2025

@linesthatinterlace That's a good idea! I'm working on adding it at the moment. But do you mean *from the most significant bit?

@linesthatinterlace
Copy link
Contributor

No, I originally said that, but I think actually cons is defined from the most significant bit because BitVec generally takes a big-endian approach.

@f64u
Copy link
Author

f64u commented Nov 24, 2025

/-- Append a single bit to the end of a bitvector, using big endian order (see `append`).
    That is, the new bit is the least significant bit. -/
@[expose]
def concat {n} (msbs : BitVec n) (lsb : Bool) : BitVec (n+1) := msbs ++ (ofBool lsb)

// ...

/--
Prepends a single bit to the front of a bitvector, using big-endian order (see `append`).

The new bit is the most significant bit.
-/
@[expose]
def cons {n} (msb : Bool) (lsbs : BitVec n) : BitVec (n+1) :=
  ((ofBool msb) ++ lsbs).cast (Nat.add_comm ..)

If for concat the new bit becomes the LSB, then we started with (processed first) the MSBs, and vice versa for cons?

@f64u f64u marked this pull request as ready for review November 24, 2025 21:40
@f64u f64u requested a review from kim-em as a code owner November 24, 2025 21:40
@f64u
Copy link
Author

f64u commented Nov 27, 2025

I was able to get a native popcount implementation that uses either compiler builtins or GMP (or naive C++), depending on size (and GMP availability). I did some benchmarking with and without this implementation, and for smaller vectors (64 bits), the difference is 10x improvment (2056ns vs 240ns avg execution), but for larger vectors the difference is even more noticeable (205x, 50531ns vs 246ns).

I'm also by no means a C++ expert, so any feedback is welcome!

This commit introduces a native C++ implementation for `BitVec.popcount`
to significantly improve its performance, especially on large bitvectors.

- The `mpz` class is extended with a `popcount` method.
- A new extern function `lean_bitvec_popcount` is implemented in the
  runtime. It uses compiler intrinsics for hardware popcount instructions
  (e.g., `__builtin_popcountll`, `__popcnt64`) when available, and
  gracefully falls back to a generic implementation on other platforms.
- `BitVec.zerocount` is refactored to be a cheap calculation based on
  the now-fast `popcount`, rather than a separate fold.
@leanprover-bot leanprover-bot added the P-low We are not planning to work on this issue label Dec 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P-low We are not planning to work on this issue toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants