fix: speculative decoding#53

Merged
SaschaOnTour merged 13 commits into main from fix/speculative-decoding
Apr 19, 2026

@SaschaOnTour
Owner

No description provided.

Contributor

Copilot AI left a comment

Pull request overview

This PR refactors TurboQuant’s cache internals to support speculative decoding by enabling concurrent per-layer writes, while also reorganizing and de-duplicating test/support code used across integration tests, benches, and examples.

Changes:

  • Replace monolithic per-cache storage with per-layer Mutex-guarded LayerStorage (plus shared StorageMetadata) and update PqoCache/TqCache to use OnceLock<GpuPrecomputed> for lazy init.
  • Split former roundtrip_tests.rs into focused integration test modules and centralize shared deterministic generators in turboquant::test_utils.
  • Bump crate + dependency versions and expand CI quality analysis scope to the whole repo.
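The per-layer locking and lazy initialization described in the first bullet can be pictured roughly as follows. This is a minimal sketch, not the crate's code: the `Cache`, `Precomputed`, `append`, and `write_all_layers_concurrently` names and all field layouts are hypothetical, and `std::sync::Mutex` stands in for whichever lock type the PR actually uses (the PR adds `parking_lot`). Only the general shape is taken from the summary: one `Mutex` per layer, shared read-only metadata, a `OnceLock` for the precomputed state, and `&self` methods.

```rust
use std::sync::{Mutex, OnceLock};
use std::thread;

/// Hypothetical stand-in for `StorageMetadata`: immutable, shared by all layers.
struct StorageMetadata {
    num_layers: usize,
    head_dim: usize,
}

/// Hypothetical stand-in for `LayerStorage`: the mutable state of one layer.
#[derive(Default)]
struct LayerStorage {
    data: Vec<f32>,
    seq_len: usize,
}

/// Hypothetical stand-in for `GpuPrecomputed`.
struct Precomputed {
    scale: f32,
}

struct Cache {
    meta: StorageMetadata,
    layers: Vec<Mutex<LayerStorage>>, // one lock per layer, not one per cache
    precomputed: OnceLock<Precomputed>,
}

impl Cache {
    fn new(num_layers: usize, head_dim: usize) -> Self {
        Self {
            meta: StorageMetadata { num_layers, head_dim },
            layers: (0..num_layers)
                .map(|_| Mutex::new(LayerStorage::default()))
                .collect(),
            precomputed: OnceLock::new(),
        }
    }

    /// Built on first use; concurrent callers all observe the same value.
    fn precomputed(&self) -> &Precomputed {
        self.precomputed.get_or_init(|| Precomputed {
            scale: 1.0 / (self.meta.head_dim as f32).sqrt(),
        })
    }

    /// Appends one token to `layer` through `&self`; only that layer is locked.
    fn append(&self, layer: usize, token: &[f32]) -> usize {
        assert_eq!(token.len(), self.meta.head_dim);
        let scale = self.precomputed().scale;
        let mut guard = self.layers[layer].lock().unwrap();
        guard.data.extend(token.iter().map(|x| x * scale));
        guard.seq_len += 1;
        guard.seq_len
    }

    fn seq_len(&self, layer: usize) -> usize {
        self.layers[layer].lock().unwrap().seq_len
    }
}

/// Spawns one writer thread per layer; writers never contend on a shared lock.
fn write_all_layers_concurrently(cache: &Cache, tokens_per_layer: usize) {
    thread::scope(|s| {
        for layer in 0..cache.meta.num_layers {
            s.spawn(move || {
                let token = vec![1.0_f32; cache.meta.head_dim];
                for _ in 0..tokens_per_layer {
                    cache.append(layer, &token);
                }
            });
        }
    });
}
```

Because every method takes `&self`, a caller can share one cache across threads without wrapping the whole structure in an outer lock, which is the property the concurrency stress tests in this PR are described as checking.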

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tests/roundtrip_tests.rs | Removed monolithic integration test file (tests redistributed into focused modules). |
| tests/rotation_tests.rs | Extracted rotation/WHT/sign-pattern tests; now uses shared test_utils::pseudo_random_vec. |
| tests/quantize_roundtrip_tests.rs | Extracted quantize/dequantize roundtrip tests; uses shared pseudo-random generator. |
| tests/packed_tests.rs | Extracted packed-format tests; small fixture cleanup (e.g., residual norm constant). |
| tests/paper_verification_tests.rs | Replaced magic numbers with named constants; clarified seeds/tolerances; added rustqual suppressions where intended. |
| tests/mse_validation.rs | Reused shared pseudo_random_vec and factored Box–Muller constants. |
| tests/inner_product_tests.rs | Reused shared LCG constants/generator; minor constant factoring. |
| tests/codebook_tests.rs | Introduced named constants for integration step counts/sample points. |
| tests/cache_type_correctness.rs | Switched to shared make_kv helper and updated usage for &self cache APIs. |
| tests/cache_storage_tests.rs | Updated tests to new LayerStorage/StorageMetadata model and added invariants/growth checks. |
| tests/cache_pqo_tests.rs | Updated to &self cache APIs and shared make_kv; minor cleanup in GPU test module imports. |
| tests/cache_internals_tests.rs | Added direct tests for ensure_gpu_precomputed (OnceLock init behavior). |
| tests/cache_concurrency_tests.rs | Added concurrency stress tests validating per-layer locking behavior under contention/reset. |
| src/test_utils.rs | Expanded shared test utilities (LCG vector + candle-only make_kv), now usable by integration tests/benches/examples. |
| src/lib.rs | Exposed test_utils as #[doc(hidden)] pub mod for cross-target reuse. |
| src/cache/tq.rs | Migrated TqCache to per-layer locking and OnceLock precomputed; introduced TqLayer wrapper. |
| src/cache/pqo.rs | Migrated PqoCache to per-layer locking and OnceLock precomputed; updated CUDA path to use LayerBuffers. |
| src/cache/storage.rs | Replaced CompressedStorage with StorageMetadata + per-layer LayerStorage + grouped LayerBuffers. |
| src/cache/mod.rs | Added ensure_gpu_precomputed helper and re-exported new storage types. |
| src/cache/common.rs | Centralized config validation into validate_and_make_metadata; adapted dequantize and quant-config creation to new types. |
| src/cache/cuda/attention.rs | Minor arithmetic cleanup (div_ceil) for partition computation. |
| rustqual.toml | Updated ignore patterns and expanded allowed magic numbers to match new tests/concurrency fixtures. |
| examples/kv_cache_demo.rs | Reused shared LCG helpers from test_utils instead of duplicating PRNG code. |
| benches/quantize_bench.rs | Reused shared pseudo_random_vec and added rustqual suppressions for criterion idioms. |
| Cargo.toml | Bumped crate version to 0.4.0, added parking_lot, and bumped mistralrs-kv-cache requirement. |
| CHANGELOG.md | Documented breaking changes and new concurrency tests for 0.4.0. |
| .github/workflows/ci.yml | Changed rustqual run target from src/ to repo root (.). |
| docs/rustqual-bugs.md | Added rustqual false-positive writeup (suppression / SRP / TQ_UNTESTED issues). |
| docs/rustqual-architecture-module-spec.md | Added architecture-module design spec (documentation-only). |
| docs/architecture-proposal-2026-04-18.md | Added clean-architecture refactor proposal (documentation-only). |
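
Several rows mention a shared deterministic LCG generator (`test_utils::pseudo_random_vec`). For readers unfamiliar with the idea, here is a minimal sketch of such a helper. The signature, the constants (Knuth's MMIX LCG parameters), and the output range of [-1, 1) are all assumptions chosen for illustration; the crate's actual helper may differ in every one of these.

```rust
// Knuth's MMIX LCG parameters; assumed here, not taken from the crate.
const LCG_MULTIPLIER: u64 = 6364136223846793005;
const LCG_INCREMENT: u64 = 1442695040888963407;

/// Advances the generator by one step (arithmetic wraps modulo 2^64).
fn lcg_next(state: u64) -> u64 {
    state
        .wrapping_mul(LCG_MULTIPLIER)
        .wrapping_add(LCG_INCREMENT)
}

/// Hypothetical signature: returns `len` values in [-1, 1), fully determined by `seed`.
fn pseudo_random_vec(len: usize, seed: u64) -> Vec<f32> {
    let mut state = seed;
    (0..len)
        .map(|_| {
            state = lcg_next(state);
            // Keep the top 24 bits: they are the best-mixed bits of an LCG
            // and fit exactly into an f32 mantissa, giving a value in [0, 1).
            let unit = (state >> 40) as f32 / (1u64 << 24) as f32;
            2.0 * unit - 1.0
        })
        .collect()
}
```

Because the output depends only on `len` and `seed`, tests, benches, and examples that share one such helper get identical fixtures on every run and platform, without pulling in an external RNG crate.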

Comment thread src/cache/pqo.rs
Comment thread src/cache/pqo.rs
Comment thread src/cache/pqo.rs
Comment thread src/cache/tq.rs
Comment thread src/cache/tq.rs
Comment thread src/cache/tq.rs Outdated
Comment thread src/cache/mod.rs
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 59 out of 59 changed files in this pull request and generated 1 comment.

Comment thread src/cache/pqo.rs
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 57 out of 58 changed files in this pull request and generated 4 comments.

Comment thread CHANGELOG.md Outdated
Comment thread src/lib.rs Outdated
Comment thread src/test_utils.rs Outdated
Comment thread tests/cache_concurrency_tests.rs Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 57 out of 58 changed files in this pull request and generated 7 comments.

Comment thread Cargo.toml Outdated
Comment thread src/cache/mod.rs Outdated
Comment thread src/cache/storage.rs Outdated
Comment thread src/cache/tq.rs Outdated
Comment thread src/cache/tq.rs Outdated
Comment thread src/cache/pqo.rs Outdated
Comment thread CHANGELOG.md Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 57 out of 58 changed files in this pull request and generated 3 comments.

Comment thread .github/workflows/ci.yml
Comment thread src/cache/storage.rs
Comment thread src/cache/mod.rs Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 57 out of 58 changed files in this pull request and generated 1 comment.

Comment thread src/cache/pqo.rs
@SaschaOnTour requested a review from Copilot on April 19, 2026, 17:04
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 57 out of 58 changed files in this pull request and generated 1 comment.

Comment thread tests/mse_qjl_tests.rs Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 57 out of 58 changed files in this pull request and generated 1 comment.

Comment thread tests/rotation_tests.rs
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 57 out of 58 changed files in this pull request and generated 1 comment.

Comment thread src/cache/mod.rs Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 57 out of 58 changed files in this pull request and generated 2 comments.

Comment thread src/cache/common.rs Outdated
Comment thread src/cache/common.rs Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 59 out of 60 changed files in this pull request and generated 2 comments.

Comment thread src/cache/common.rs
Comment thread src/cache/mod.rs
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 58 out of 59 changed files in this pull request and generated no new comments.

@SaschaOnTour merged commit 445a240 into main on Apr 19, 2026
5 checks passed