All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
### Added

- Two-stage candidate ranking system with `CandidateRankingModel` and supporting classes (`CandidateGenerator`, `CandidateFeatureCollector`, `Reranker`, `CatBoostReranker`, `PerUserNegativeSampler`) (#296)

### Changed

- Used `pm-implicit` instead of `implicit` for Python >= 3.10 to support CUDA 12.x (#298)
### Added

- LiGR transformer layers from "From Features to Transformers: Redefining Ranking for Scalable Impact" (#295)
- HSTU model from "Actions Speak Louder than Words..." implemented in the `HSTUModel` class (#290)
- `leave_one_out_mask` function (`rectools.models.nn.transformers.utils.leave_one_out_mask`) for applying leave-one-out validation during transformer model training (#292)
- `logits_t` argument to `TransformerLightningModuleBase`. It is used to scale logits when computing the loss (#290)
- `use_scale_factor` argument to `LearnableInversePositionalEncoding`. It scales embeddings by the square root of their dimension, following the original approach from "Attention Is All You Need" (#290)
- Optional `context` argument to the `recommend` method of models and `get_context` function in `rectools.dataset.context` (#290)
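The `use_scale_factor` behaviour can be sketched in plain Python. This is only an illustration of the scaling described above, not the RecTools implementation; `scale_embedding` is a hypothetical helper name.

```python
import math

# Illustration (not RecTools code): with use_scale_factor enabled,
# an embedding is multiplied by sqrt(dim) before positional encodings
# are added, as in "Attention Is All You Need".
def scale_embedding(embedding, use_scale_factor=True):
    dim = len(embedding)
    factor = math.sqrt(dim) if use_scale_factor else 1.0
    return [x * factor for x in embedding]

print(scale_embedding([0.5, -0.5, 1.0, 0.0]))  # dim=4, so each value is doubled
```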
### Fixed

- [Breaking] Corrected computation of `cosine` distance in `DistanceSimilarityModule` (#290)
- Installation issue with `cupy` extra on macOS (#293)
- `torch.dtype object has no attribute 'kind'` error in `TorchRanker` (#293)
### Removed

- [Breaking] `Dropout` module from `IdEmbeddingsItemNet`. This changes model behaviour during training, so model results starting from this release might slightly differ from previous RecTools versions even when the random seed is fixed (#290)
### Added

- `extras` argument to `SequenceDataset`, `extra_cols` argument to `TransformerDataPreparatorBase`, `session_tower_forward` and `item_tower_forward` methods to `SimilarityModuleBase` (#287)
- Support for resaving transformer models multiple times and loading trainer state (#289)
### Changed

- [Breaking] Now `LastNSplitter` guarantees taking the last ordered interaction in the dataframe in case of identical timestamps (#288)
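The tie-breaking guarantee above can be illustrated in plain Python. This is a sketch of the behaviour, not RecTools code; `last_interaction` is a hypothetical helper.

```python
# Illustration (not RecTools code): when timestamps are identical,
# the interaction that appears last in dataframe order is the one taken.
interactions = [
    # (user_id, item_id, timestamp) in dataframe order
    ("u1", "i1", 10),
    ("u1", "i2", 20),
    ("u1", "i3", 20),  # same timestamp as i2, but later in the dataframe
]

def last_interaction(rows):
    # Stable max over timestamp: among ties, the later row in order wins.
    best = None
    for row in rows:
        if best is None or row[2] >= best[2]:
            best = row
    return best

print(last_interaction(interactions))  # ("u1", "i3", 20)
```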
### Added

- Python 3.13 support (#227)
- `fit_partial` implementation for transformer-based models (#273)
- `map_location` and `model_params_update` arguments for the `load_from_checkpoint` function for transformer-based models. Use `map_location` to explicitly specify the new computing device and `model_params_update` to update original model parameters (e.g. remove training-specific parameters that are not needed anymore) (#281)
- `get_val_mask_func_kwargs` and `get_trainer_func_kwargs` arguments for transformer-based models to allow keyword arguments in custom functions used for model training (#280)
- `TransformerNegativeSamplerBase` and `CatalogUniformSampler` classes, `negative_sampler_type` and `negative_sampler_kwargs` parameters to transformer-based models (#275)
- `SimilarityModuleBase` and `DistanceSimilarityModule` classes, similarity module to `TransformerTorchBackbone`, `similarity_module_type` and `similarity_module_kwargs` parameters to transformer-based models (#272)
- `out_dim` property to `IdEmbeddingsItemNet`, `CatFeaturesItemNet` and `SumOfEmbeddingsConstructor` (#276)
- `TransformerBackboneBase` class, `backbone_type` and `backbone_kwargs` parameters to transformer-based models (#277)
- `sampled_softmax` loss option for transformer models (#274)
### Fixed

- Interactions extra columns are not dropped in `Dataset.filter_interactions` method (#267)
### Added

- `SASRecModel` and `BERT4RecModel` - models based on transformer architecture (#220)
- Transformers extended theory & practice tutorial, advanced training guide and customization guide (#220)
- `use_gpu` for PureSVD (#229)
- `from_params` method for models and `model_from_params` function (#252)
- `TorchRanker` ranker which calculates scores using torch. Supports GPU (#251)
- `Ranker` ranker protocol which unifies ranker calls (#251)
### Changed

- `ImplicitRanker` `rank` method is compatible with the `Ranker` protocol. `use_gpu` and `num_threads` params moved from the `rank` method to `__init__` (#251)
### Added

- `ImplicitBPRWrapperModel` model with algorithm description in extended baselines tutorial (#232, #239)
- All vector models and `EASEModel` support for enabling ranking on GPU and selecting the number of threads for CPU ranking. Added `recommend_n_threads` and `recommend_use_gpu_ranking` parameters to `EASEModel`, `ImplicitALSWrapperModel`, `ImplicitBPRWrapperModel`, `PureSVDModel` and `DSSMModel`. Added `recommend_use_gpu_ranking` to `LightFMWrapperModel`. GPU and CPU ranking may provide different ordering of items with identical scores in the recommendation table, so this could change the ordering of items in recommendations since GPU ranking is now used as the default (#218)
- `from_config`, `get_config` and `get_params` methods to all models except neural-net-based (#170)
- `fit_partial` implementation for `ImplicitALSWrapperModel` and `LightFMWrapperModel` (#203, #210, #223)
- `save` and `load` methods to all of the models (#206)
- Model configs example (#207, #219)
- `use_gpu` argument to `ImplicitRanker.rank` method (#201)
- `keep_extra_cols` argument to `Dataset.construct` and `Interactions.from_raw` methods. `include_extra_cols` argument to `Dataset.get_raw_interactions` and `Interactions.to_external` methods (#208)
- dtype adjustment to `recommend`, `recommend_to_items` methods of `ModelBase` (#211)
- `load_model` function (#213)
- `model_from_config` function (#214)
- `get_cat_features` method to `SparseFeatures` (#221)
- LightFM Python 3.12+ support (#224)
### Fixed

- Implicit ALS matrix zero assignment size (#228)
### Removed

- Python 3.8 support (#222)
### Added

- `Debias` mechanism for classification, ranking and auc metrics. New parameter `is_debiased` to `calc_from_confusion_df`, `calc_per_user_from_confusion_df` methods of classification metrics, `calc_from_fitted`, `calc_per_user_from_fitted` methods of auc and ranking (MAP) metrics, `calc_from_merged`, `calc_per_user_from_merged` methods of ranking (NDCG, MRR) metrics (#152)
- `nbformat >= 4.2.0` dependency to `[visuals]` extra (#169)
- `filter_interactions` method of `Dataset` (#177)
- `on_unsupported_targets` parameter to `recommend` and `recommend_to_items` model methods (#177)
- Use nmslib-metabrainz for Python 3.11 and above (#180)

### Fixed

- `display()` method in `MetricsApp` (#169)
- `IntraListDiversity` metric computation in `cross_validate` (#177)
- Allow warp-kos loss for `LightFMWrapperModel` (#175)
### Removed

- [Breaking] `assume_external_ids` parameter in `recommend` and `recommend_to_items` model methods (#177)
### Added

- Extended Theory&Practice RecSys baselines tutorial (#139)
- `MetricsApp` to create plotly scatterplot widgets for metric-to-metric trade-off analysis (#140, #154)
- `Intersection` metric (#148)
- `PartialAUC` and `PAP` metrics (#149)
- New params (`tol`, `maxiter`, `random_state`) to the `PureSVD` model (#130)
- Recommendations data quality metrics: `SufficientReco`, `UnrepeatedReco`, `CoveredUsers` (#155)
- `r_precision` parameter to `Precision` metric (#155)
### Fixed

- Used `rectools-lightfm` instead of pure `lightfm`, which allows installing it with `poetry >= 1.5.0` (#165)
- Added restriction to `pytorch` version for macOS + x86_64, which allows installing it on such platforms (#142)
- `PopularInCategoryModel` fitting for multiple times, `cross_validate` compatibility, behaviour with empty category interactions (#163)
### Added

- Warm users/items support in `Dataset` (#77)
- Warm and cold users/items support in `ModelBase` and all possible models (#77, #120, #122)
- Warm and cold users/items support in `cross_validate` (#77)
- [Breaking] Default value for train dataset type and params for user and item dataset types in `DSSMModel` (#122)
- [Breaking] `n_factors` and `deterministic` params to `DSSMModel` (#122)
- Hit Rate metric (#124)
- Python `3.11` support (without `nmslib`) (#126)
- Python `3.12` support (without `nmslib` and `lightfm`) (#126)
### Changed

- Changed the logic of choosing random sampler for `RandomModel` and increased the sampling speed (#120)
- [Breaking] Changed the logic of `RandomModel`: now the recommendations are different for repeated calls of recommend methods (#120)
- Torch datasets to support warm recommendations (#122)
- [Breaking] Replaced `include_warm` parameter in `Dataset.get_user_item_matrix` with the pair `include_warm_users` and `include_warm_items` (#122)
- [Breaking] Renamed torch datasets and `dataset_type` to `train_dataset_type` param in `DSSMModel` (#122)
- [Breaking] Updated minimum versions of `numpy`, `scipy`, `pandas`, `typeguard` (#126)
- [Breaking] Set restriction `scipy < 1.13` (#126)
### Removed

- [Breaking] `return_external_ids` parameter in `recommend` and `recommend_to_items` model methods (#77)
- [Breaking] Python `3.7` support (#126)
### Added

- `VisualApp` and `ItemToItemVisualApp` widgets for visual comparison of recommendations (#80, #82, #85, #115)
- Methods for conversion of `Interactions` to raw form and for getting raw interactions from `Dataset` (#69)
- `AvgRecPopularity (Average Recommendation Popularity)` to `metrics` (#81)
- Added `normalized` parameter to `AvgRecPopularity` metric (#89)
- Added `EASE` model (#107)
### Changed

- Loosened `pandas`, `torch` and `torch-light` versions for `python >= 3.8` (#58)
### Fixed

- Bug in `Interactions.from_raw` method (#58)
- Mistakes in formulas for Serendipity and MIUF in docstrings (#115)
- Examples reproducibility on Google Colab (#115)
### Added

- Ability to pass internal ids to `recommend` and `recommend_to_items` methods and get internal ids back (#70)
- `rectools.model_selection.cross_validate` function (#71, #73)
### Changed

- Loosened `lightfm` version, now it's possible to use 1.16 and 1.17 (#72)
### Fixed

- Small bug in `LastNSplitter` with incorrect `i_split` in info (#70)
### Changed

- Updated attrs version (#56)
- Optimized inference for vector models with EUCLIDEAN distance using `implicit` library topk method (#57)
- Changed features processing example (#60)
### Added

- `MRR (Mean Reciprocal Rank)` to `metrics` (#29)
- `F1beta`, `MCC (Matthew correlation coefficient)` to `metrics` (#32)
- Base `Splitter` class to construct data splitters (#31)
- `RandomSplitter` to `model_selection` (#31)
- `LastNSplitter` to `model_selection` (#33)
- Support for `Python 3.10` (#47)
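The MRR metric mentioned above can be sketched in plain Python. This is a simplified illustration of the formula, not the RecTools implementation (the library's metric classes have more options, such as a top-`k` cutoff); `mrr` here is a hypothetical helper.

```python
# Illustration (not RecTools code) of Mean Reciprocal Rank:
# for each user, take 1 / rank of the first relevant recommended item,
# then average over users.
def mrr(recommendations, relevant):
    scores = []
    for user, items in recommendations.items():
        score = 0.0
        for rank, item in enumerate(items, start=1):
            if item in relevant.get(user, set()):
                score = 1.0 / rank
                break
        scores.append(score)
    return sum(scores) / len(scores)

reco = {"u1": ["a", "b", "c"], "u2": ["x", "y", "z"]}
rel = {"u1": {"b"}, "u2": {"x"}}
print(mrr(reco, rel))  # (1/2 + 1/1) / 2 = 0.75
```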
### Changed

- Bumped `implicit` version to `0.7.1` (#45)
- Bumped `lightfm` version to `1.17` (#43)
- Bumped `pylint` version to `2.17.6` (#43)
- Moved `nmslib` from main dependencies to extras (#36)
- Moved `lightfm` to extras (#51)
- Renamed `nn` extra to `torch` (#51)
- Optimized inference for vector models with COSINE and DOT distances using `implicit` library topk method (#52)
- Changed initialization of `TimeRangeSplitter` (instead of `date_range` argument, use `test_size` and `n_splits`) (#53)
- Changed split infos key names in splitters (#53)
### Fixed

- Bugs with new version of `pytorch_lightning` (#43)
- `pylint` config for new version (#43)
- Cyclic imports (#45)
- `Markdown` dependency (#54)