Releases: lenskit/lkpy
Lots of performance improvements, mostly
This version includes many performance improvements, along with ergonomic improvements to logging, better exports, some new metrics, a handful of bug fixes, and importers for the Amazon data sets.
2025.3.0 is coming soon.
What's Changed
- Add versions to dataset schemas by @mdekstrand in #721
- Add SparseRow extension type for CSRs in Arrow by @mdekstrand in #722
- Use Rust to accelerate negative sampling by @mdekstrand in #723
- Re-add progress bars and signal checking to Item KNN builder by @mdekstrand in #724
- Add ndarray dep and use it for sampling by @mdekstrand in #725
- Use Maturin to simplify build setup by @mdekstrand in #726
- Reduce base structure memory consumption by @mdekstrand in #728
- Add Rust acceleration to User KNN by @mdekstrand in #730
- Rewrite ALS solvers in Rust by @mdekstrand in #731
- Rewrite FunkSVD in Rust by @mdekstrand in #732
- Build with ABI3 and maturin-action by @mdekstrand in #733
- Add SPEC-1 (Lazy Loading) and incorporate version numbers into docs by @mdekstrand in #736
- Implement is_sorted for matrix tables in Rust by @mdekstrand in #737
- Correctly detect no-genre movies in MovieLens by @mdekstrand in #738
- Adding Mean Average Precision by @albus-droid in #739
- Add tests to avoid overflow in softmax stochastic sampling by @mdekstrand in #742
- Add configurable weighting models by @mdekstrand in #744
- Parallelize item-item scoring by @mdekstrand in #745
- Support comma-separated lists for uniform nested parallelism configuration by @mdekstrand in #746
- Test and fix crash on Linux ARM by @mdekstrand in #748
- Update ALS solve to zero rows for no-info users/items by @mdekstrand in #749
- Add importers for UCSD Amazon data by @mdekstrand in #751
- refactor FlexMF with less string compares by @mdekstrand in #752
- Add sklearn NMF as a scorer by @FroggoLight in #741
- Speed up ItemList isin checks and evaluation metrics by @mdekstrand in #753
- Speed up ItemList indexing by @mdekstrand in #754
- Add Gini coefficient metrics by @mdekstrand in #755
- Speed up weight computations by @mdekstrand in #756
- Speed up weighted metrics by @mdekstrand in #757
- Add ItemList.top_n and speed up ItemList operations by @mdekstrand in #758
- Add subprocess progress reporting by @mdekstrand in #759
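
For readers unfamiliar with the metrics added in #739 and #755, the following is a minimal pure-Python sketch of one common definition of (mean) average precision and of the Gini coefficient. The function names `average_precision`, `mean_average_precision`, and `gini` are hypothetical and chosen for illustration; this is not LensKit's implementation or API, and LensKit's metrics may differ in normalization and edge-case handling.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant items.

    Illustrative definition: the mean of precision@k over the ranks k at
    which a relevant item appears, divided by the number of relevant items.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)


def mean_average_precision(lists):
    """Mean of average precision over an iterable of (ranked, relevant) pairs."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in lists]
    return sum(scores) / len(scores) if scores else 0.0


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula over values sorted in ascending order, 1-based index i.
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)
```

In a recommender evaluation, `gini` would typically be applied to per-item exposure or recommendation counts: a value near 0 means recommendations are spread evenly across items, while a value near 1 means they are concentrated on a few items.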
New Contributors
- @albus-droid made their first contribution in #739
- @FroggoLight made their first contribution in #741
Full Changelog: v2025.3.0a2.post2...v2025.3.0a3
Fix 2025.3.0a2 again
Now that I've found the problem, let's fix the release!
Fix 2025.3.0a2
This is just to fix a release problem in 2025.3.0a2.
Preview: FlexMF and better Ray
This rolls up some changes and incoming improvements into a preview release as we prepare for the 2025.3 release.
What's Changed
- Simplify test workflow by @mdekstrand in #666
- Add StochasticTopNSampler to be correct version of "softmax" sampler by @mdekstrand in #667
- Add flexible Torch matrix factorizer by @mdekstrand in #668
- Export configuration classes from lenskit.basic by @mdekstrand in #672
- Improve logging in Ray cluster setups by @mdekstrand in #673
- Limit test users to only have users who had item in the train data. by @sushobhan2024 in #676
- Add train and recommend commands to the CLI by @mdekstrand in #677
- Improve NumPy -> Torch conversions for read-only data by @mdekstrand in #680
- Reduce excess matrix manipulations in item-KNN by @mdekstrand in #678
- Improve progress bar update API by @mdekstrand in #681
- Update FlexMF parameters to better defaults by @mdekstrand in #682
- Run Doctor under coverage by @mdekstrand in #683
- Use uv for development environments instead of pixi by @mdekstrand in #700
- Add lenskit.state module and move the ParameterContainer interface by @mdekstrand in #702
- Create and use separate ModelTrainer objects by @mdekstrand in #701
- Support training checkpointing in ALS by @mdekstrand in #704
- Add CPU & GPU groups by @mdekstrand in #706
- Improve logging and reduce sorting in dataset relationships by @mdekstrand in #707
- Improve debug logging and add xopen to CLI by @mdekstrand in #708
- Introduce WARP loss to FlexMF by @mdekstrand in #709
- Build with setuptools instead of hatch by @mdekstrand in #710
- Add better compression support to CLI by @mdekstrand in #712
- Update pre-commit hook versions by @mdekstrand in #713
New Contributors
- @sushobhan2024 made their first contribution in #676
Full Changelog: v2025.2.0...v2025.3.0a1
Now with Rust
This adds Rust-based acceleration to item KNN, and more acceleration will be coming in the 2025.3.0 release.
This is the first attempt to publish binary wheels to PyPI, so the exact release may break.
What's Changed
- Add Rust extension infrastructure and accelerate ItemKNN by @mdekstrand in #715
- Replace just with invoke for development tasks by @mdekstrand in #716
Full Changelog: v2025.3.0a1...v2025.3.0a2
2025 Feature Update
A few small feature updates for LensKit 2025.
What's Changed
- Support auto-detecting key columns in ItemListCollection.from_df by @mdekstrand in #659
- Support dataframe-format test data for batch recommendation by @mdekstrand in #660
- Add PipelineCache to allow pipeline builders to cache component instances by @mdekstrand in #661
- Only warn once for users with missing test items in analysis by @mdekstrand in #664
Full Changelog: v2025.1.1...v2025.2.0
LensKit 2025
This is the first release in the new LensKit series, 2025.1.1!
LensKit 2025.1.1 brings a new design to LensKit, with a new generation of APIs that will enable better future flexibility and capability, and make it a lot easier to see the various software capabilities. It will also be easier to add new capabilities, such as content-based and knowledge-based recommenders.
We have plans for a lot of great new things on top of this new foundation, but code written for LensKit 0.14 and earlier will need to be updated. See the migration guide for details.
Better logging
Improved logging and metrics — almost there!
What's Changed
- Add meaningful error on duplicate metrics by @mdekstrand in #653
- Fix logic for monitors in nested workers by @mdekstrand in #654
Full Changelog: v2025.1.1rc4...v2025.1.1rc5
Move around util and clean up
Not super happy with this being in an RC series, but this removes some old util code, keeping a deprecated shim for a bit, and adds the lenskit doctor command, along with multiple negatives in negative sampling.
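
To illustrate what "multiple negatives in negative sampling" means in general, here is a minimal pure-Python sketch that draws several unseen items for each positive (user, item) pair by rejection sampling. The function `sample_negatives` and its parameters are hypothetical and exist only for this example; this is not LensKit's sampler, whose interface and sampling strategy may differ.

```python
import random


def sample_negatives(positives, n_items, n_neg, rng):
    """Draw n_neg negative item IDs for each positive (user, item) pair.

    Assumes items are numbered 0..n_items-1. Negatives are drawn uniformly
    with replacement from the items the user has not interacted with, so
    the same negative may appear more than once for a pair.
    """
    # Collect each user's observed items so they can be rejected.
    seen = {}
    for user, item in positives:
        seen.setdefault(user, set()).add(item)

    result = []
    for user, item in positives:
        if len(seen[user]) >= n_items:
            raise ValueError(f"user {user} has no unseen items to sample")
        negatives = []
        while len(negatives) < n_neg:
            candidate = rng.randrange(n_items)
            if candidate not in seen[user]:  # reject observed items
                negatives.append(candidate)
        result.append((user, item, negatives))
    return result


pairs = [(0, 1), (0, 2), (1, 0)]
samples = sample_negatives(pairs, n_items=5, n_neg=3, rng=random.Random(42))
```

Each entry of `samples` pairs one positive interaction with three items that the same user has not interacted with, which is the shape of input that pairwise and list-wise ranking losses usually consume.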
Fix ItemList.from_arrow
This fixes a bug in ItemList.from_arrow when a column has null values.