Releases: sb-ai-lab/RePlay
v0.21.4
RePlay 0.21.4 Release notes
Release date: 2026-03-02 · Version: 0.21.4 · Type: Patch
- Bug fixes
Bug fixes
- Fixed the `SequenceEncodingRule` `transform` method on PySpark. Previously, empty arrays were dropped from the dataframe; now they remain empty after encoding.
- Fixed the behavior of `ComputeMetricsCallback` when the number of batches is limited via `lightning.Trainer`. Previously, the metrics were not logged into the trainer and, accordingly, were not printed.
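The fixed `SequenceEncodingRule` behavior can be illustrated with a minimal pure-Python sketch (this is not RePlay's implementation; the function name is hypothetical): empty input sequences must survive the encoding step as empty sequences instead of being dropped.

```python
# Illustrative sketch, not RePlay's code: label-encode per-user item
# sequences while preserving empty sequences in the output.

def encode_sequences(sequences, mapping):
    """Encode each item in each sequence; keep empty sequences as-is."""
    return [[mapping[item] for item in seq] for seq in sequences]

mapping = {"a": 0, "b": 1, "c": 2}
sequences = [["a", "b"], [], ["c"]]

encoded = encode_sequences(sequences, mapping)
print(encoded)  # [[0, 1], [], [2]]
```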
v0.21.3
RePlay 0.21.3 Release notes
Release date: 2026-02-26 · Version: 0.21.3 · Type: Patch
- New features
- Bug fixes
New features
- Added new transforms:
  - `SelectTransform` for keeping only the required fields in a batch to reduce GPU memory usage.
  - `EqualityMaskTransform` for applying a feature-based mask to the existing boolean mask.
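The idea behind `SelectTransform` can be sketched in plain Python (a standalone illustration only; the class name and batch layout here are assumptions, not RePlay's API): drop batch fields the model does not consume so they never occupy GPU memory.

```python
# Hypothetical standalone sketch of a field-selection transform: keep only
# the fields a model actually uses, discarding the rest of the batch.

class SelectFields:
    def __init__(self, fields):
        self.fields = set(fields)

    def __call__(self, batch: dict) -> dict:
        return {name: value for name, value in batch.items() if name in self.fields}

batch = {"item_id": [1, 2, 3], "timestamp": [10, 20, 30], "debug_info": ["x", "y", "z"]}
select = SelectFields(["item_id", "timestamp"])
print(select(batch))  # {'item_id': [1, 2, 3], 'timestamp': [10, 20, 30]}
```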
Bug fixes
- `ParquetModule` and Transforms:
  - Enabled passing additional performance-related parameters to `ParquetDataset` via `ParquetModule`.
  - Fixed `ParquetModule` (moved transforms to the device) and corrected transforms used for negative sampling that previously degraded performance.
  - Standardized transform parameter naming. Previously, parameters were inconsistently named (`name`, `field`, `column`); now `name` is used consistently across all transforms.
- NN models:
  - Removed the `item_tower_feature_names` parameter from `TwoTower`. This parameter was redundant; item tower column names can now be obtained from `FeaturesReader`.
  - Fixed embedding initialization. The padding row in categorical embeddings is now always initialized to zero.
- Callbacks:
  - Fixed `TopItemsCallbackBase` behavior. Previously, the callback relied on a hard-coded column name to read user IDs. The column name can now be specified by the user and passed to the constructor.
  - Fixed `ComputeMetricsCallback`. When multiple metric callbacks were used, identical metrics were returned multiple times.
v0.21.2
RePlay 0.21.2 Release notes
Release date: 2026-02-11 · Version: 0.21.2 · Type: Patch
- Bug fixes
Bug fixes
- Fixed numerical instability of models in the block architecture (`replay.nn.sequential.SasRec`, `replay.nn.sequential.TwoTower`) when converting to ONNX. The padding value of the floating-point attention mask remains `-torch.inf` during training; during inference it is replaced with the smallest finite `torch.float32` value.
- Converted `key_padding_mask` in `torch.nn.MultiheadAttention` from bool type to float type to resolve the PyTorch warning about mixing types between the padding mask and the attention mask.
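The masking scheme above can be illustrated without torch (a pure-Python sketch; RePlay's actual code operates on tensors): a boolean key-padding mask becomes an additive float mask, with padded positions filled by `-inf` during training and by the smallest finite float32 value for ONNX inference.

```python
# Illustrative sketch, not RePlay's implementation. True (padded) positions
# become a large negative value added to attention scores before softmax.

FLOAT32_MIN = -3.4028234663852886e+38  # equals torch.finfo(torch.float32).min

def to_float_mask(bool_mask, training):
    """Convert a boolean padding mask to an additive float mask."""
    fill = float("-inf") if training else FLOAT32_MIN
    return [fill if padded else 0.0 for padded in bool_mask]

print(to_float_mask([False, False, True], training=True))
print(to_float_mask([False, False, True], training=False))
```

Replacing `-inf` with a finite minimum at inference time keeps the exported ONNX graph numerically stable while remaining effectively zero after softmax.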
v0.21.1
RePlay 0.21.1 Release notes
Release date: 2026-02-05 · Version: 0.21.1 · Type: Patch
- Bug fixes
Bug fixes
- Fixed the functions for creating default sets of batch transforms, `make_default_twotower_transforms` and `make_default_sasrec_transforms`. These functions now create transforms compatible with models that use all features from the `TensorSchema` object, not just the item identifier.
- Fixed compatibility of `ComputeMetricsCallback` with multiple dataloaders.
v0.21.0
RePlay 0.21.0 Release notes
Table of Contents
- Table of Contents
- Release Notes
- Highlights
- Deprecations
- New Features
- Improvements
- Bug Fixes
- Migration Notes
- References
Release Notes
Release date: 2026-01-30 · Version: 0.21.0 · Type: Minor
Highlights
This release introduces a redesigned neural network architecture and a new data processing pipeline,
bringing improved scalability, flexibility, and transparency to the model training workflow.
Key benefits:
- Train on significantly larger datasets with lower memory usage thanks to batch-wise data loading instead of full in-memory loading
- Customize data preprocessing more easily with composable batch-level transforms and no hidden logic
- Build and extend models flexibly, without upgrading the library, using a block-based architecture with reusable components and the ability to add custom blocks when needed
- Experiment faster by decoupling model blocks
- Adopt new architectures incrementally, while existing pipelines continue to work
Legacy APIs continue to work, but backwards incompatible changes are planned for upcoming releases.
See the Deprecations section for details.
This release itself is fully backward-compatible, and existing pipelines will continue to work. It lays the foundation for future model and pipeline extensions.
Deprecations
Legacy Data Pipeline APIs
The following APIs are deprecated but continue to work in this release.
They will be removed in upcoming releases.
- Deprecated modules. The previous multi-stage data pipeline for the neural network workflow has been deprecated, including:
  - `SequentialDataset`
  - `SequenceTokenizer`
  - `SasRecTrainingDataset` / `SasRecValidationDataset` / `SasRecPredictionDataset`
  - `SasRecTrainingBatch` / `SasRecValidationBatch` / `SasRecPredictionBatch`
- Replacement. A new data pipeline is introduced (see ParquetModule for details). The new pipeline provides greater flexibility and enables training on significantly larger datasets by avoiding out-of-memory (OOM) issues.
Model API Changes
- SASRec APIs have been redesigned around a block-based architecture.
- Lightning-specific wrappers are no longer model-specific.
- Models no longer encapsulate loss computation internally.
- Deprecated modules: `SasRec`, `SasRecModel`
Action Required
It's recommended to migrate custom datasets and preprocessing logic to the new pipeline.
New Features
ParquetModule
ParquetModule is the core building block of the new data pipeline.
- Key Features
- Automatic padding and sequence truncation based on a provided schema
- Batch-wise reading and processing, enabling efficient work with large datasets in memory-constrained environments by avoiding loading the full dataset into memory
- Full compatibility with PyTorch Lightning Trainers.
- Built-in support for multiple dataloaders for validation, testing and prediction
- Built-in support for PyTorch Distributed Data Parallel (DDP)
- Batch-level transforms can be easily composed into custom preprocessing pipelines,
and extended with user-defined transforms when needed
- Pipelines comparison

```
LEGACY PIPELINE                    NEW PIPELINE
───────────────                    ────────────
Raw Data                           Raw Data
   ↓                                  ↓
Filtering/Splitting                User-defined CPU Preprocessing
   ↓                               ├─ Filtering
Dataset                            ├─ Splitting
   ↓                               ├─ Tokenizing
SequenceTokenizer                  └─ Grouping
├─ Tokenizing                         ↓
└─ Grouping                        ParquetModule
   ↓                               (GPU batch-level preprocessing)
├─ SasRecTrainingDataset           ├─ Padding
│  ├─ Padding                      ├─ Shifting
│  ├─ Shifting                     └─ Negative Sampling
│  └─ Negative Sampling               ↓
├─ SasRecValidationDataset         Model Forward Pass
└─ SasRecPredictDataset
   ↓
torch.utils.data.DataLoader
   ↓
Model Forward Pass
```

- Details
Data may be prepared manually using any data processing framework, for example Pandas, Polars, PySpark, or PyArrow. The data must be saved in Parquet format. Pay attention to partition sizes when saving: 256 to 512 MB per partition is recommended.
A `ParquetModule` instance is created by:
- specifying metadata, including shape and padding, for each data split
- specifying a list of batch-level transforms for each data split

`ParquetModule` reads data batch-by-batch and applies split-specific transforms
immediately before feeding each batch into the model.
This design ensures scalability and efficient memory usage.
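The design can be sketched in plain Python (an illustration of the pattern, not RePlay's `ParquetModule` API; all names here are hypothetical): data is consumed in batches, and a split-specific transform list runs on each batch just before it would reach the model.

```python
# Illustrative sketch, not RePlay's implementation: batch-wise iteration
# plus per-split, composable batch-level transforms.

def iter_batches(rows, batch_size):
    """Yield dict-of-lists batches from row dicts, one chunk at a time."""
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        yield {key: [row[key] for row in chunk] for key in chunk[0]}

def apply_transforms(batch, transforms):
    for transform in transforms:
        batch = transform(batch)
    return batch

def pad_item_ids(batch, length=4, pad=0):
    """Example batch-level transform: right-pad item sequences to a fixed length."""
    batch["item_id"] = [seq + [pad] * (length - len(seq)) for seq in batch["item_id"]]
    return batch

rows = [{"user_id": u, "item_id": [u, u + 1]} for u in range(4)]
transforms = {"train": [pad_item_ids], "predict": []}  # split-specific lists

for batch in iter_batches(rows, batch_size=2):
    batch = apply_transforms(batch, transforms["train"])
    print(batch["item_id"])
```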
Neural Network Architecture Redesign
Neural network models follow a block-based architecture, where:
- Models receive pre-built component instances instead of raw configuration parameters
- A single unified Lightning wrapper is shared across all models
- Core components (losses, embedders, heads, etc.) are implemented as reusable modules
All reusable building blocks are located under the `replay.nn` module.
The new architecture is currently implemented for SASRec and TwoTower.
Support for BERT4Rec will be added in a future release.
NN Losses and training
- Loss computation is decoupled from NN models and is provided as standalone blocks within the block-based architecture
- `LogInCE`, `LogInCESampled` and `LogOutCE` loss functions are added
- Sampled losses (`BCESampled`, `CESampled`, `LogInCESampled`) support per-sample weighting via batch-provided weights
- Support for negative labels per batch element is added for sampled losses. `MultiClassNegativeSamplingTransform` allows selecting negative items from different subsets of the item catalogue.
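The idea behind sampling negatives from catalogue subsets can be sketched in plain Python (illustrative only; the function and subset names are assumptions, not RePlay's `MultiClassNegativeSamplingTransform` API): negatives for a positive item are drawn from a chosen subset of the catalogue rather than from the whole catalogue, and never equal the positive itself.

```python
import random

# Illustrative sketch, not RePlay's implementation: per-class negative
# sampling from disjoint catalogue subsets.

catalogue_subsets = {
    "books": [1, 2, 3, 4],
    "movies": [10, 11, 12, 13],
}

def sample_negatives(positive, subset_name, num_negatives, rng):
    """Draw negatives from one catalogue subset, excluding the positive item."""
    pool = [item for item in catalogue_subsets[subset_name] if item != positive]
    return rng.sample(pool, num_negatives)

rng = random.Random(42)
negatives = sample_negatives(positive=2, subset_name="books", num_negatives=2, rng=rng)
print(negatives)  # two items from the "books" subset, never the positive
```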
New Model: TwoTower
- Introduced a new `TwoTower` model implemented using the block-based architecture
- Fully compatible with the new data pipeline and NN workflow
Improvements
- Optimized `SequenceEncodingRule`: the achieved speedup on large datasets (100+ million rows) in an industrial pipeline is up to 15x.
Bug Fixes
- Fixed `SasRecCompiled` and `Bert4RecCompiled` compilation issues with `torch >= 2.9.0`
Migration Notes
- Migration to the new parquet-based pipeline will be required for existing datasets and preprocessing pipelines in upcoming releases
- Custom preprocessing logic from `SasRecTrainingDataset` and related classes should be reimplemented using batch-level transforms
- Existing models (for example, `replay.models.nn.sequential.SasRec`) trained in earlier versions of the library are deprecated but still available. However, checkpoints produced by the legacy SasRec cannot be loaded into the redesigned SasRec.
- For the redesigned `SasRec` and the new `TwoTower`, export to ONNX format and further compilation via OpenVINO are not yet available; this functionality will be implemented in the next release. For now, you can use the `to_onnx()` method from `LightningModule` or apply `torch.onnx.export()` to the model itself.
- NN Transformers (`SasRec` and `BERT4Rec`) now use dict-based batches instead of NamedTuple
References
For detailed usage examples and documentation, see the links.
- API documentation:
- Examples:
v0.20.3
RePlay 0.20.3 Release notes
- Bug fixes
Bug fixes
Fixed compatibility with Pandas < 2.0.0. The previously fixed saving/loading of `SequentialDataset` abstractions was incompatible with Pandas < 2.0.0.
v0.20.2
RePlay 0.20.2 Release notes
- Bug fixes
Bug fixes
Extended the dependency version on SciPy from ">=1.13.1,<1.14" to ">=1.8.1,<2.0.0".
Shortened the dependency version for PyTorch from "<3.0.0" to "<2.9.0", due to the inability to use compiled models from the `replay.models.nn.sequential.compiled` module with PyTorch 2.9.0 and higher.
v0.20.1
RePlay 0.20.1 Release notes
- Bug fixes
Bug fixes
Fixed saving of `SequentialDataset` abstractions. Previously, the dataset contained inside the abstraction was saved in JSON format; it is now saved in Parquet format.
Fixed the processing of model scores inside the `RemoveSeenItems` postprocessor. Previously, postprocessing modified the initial scores; now the initial scores remain unchanged.
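The `RemoveSeenItems` fix amounts to masking on a copy instead of mutating the input. A minimal pure-Python sketch (not RePlay's implementation; the function name and score layout are hypothetical):

```python
# Illustrative sketch, not RePlay's code: mask seen items on a copy of the
# scores so the caller's original scores are left untouched.

def remove_seen(scores, seen_items):
    masked = dict(scores)  # copy; the input dict is not mutated
    for item in seen_items:
        if item in masked:
            masked[item] = float("-inf")
    return masked

scores = {"a": 0.9, "b": 0.5, "c": 0.1}
filtered = remove_seen(scores, seen_items={"b"})
print(filtered["b"])  # -inf
print(scores["b"])    # 0.5 — the initial scores are unchanged
```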
v0.20.0
RePlay 0.20.0 Release notes
- Highlights
- Backwards Incompatible Changes
- New Features
- Improvements
- Bug fixes
Highlights
We are excited to announce the release of RePlay 0.20.0!
In this update, we added Python 3.12 support, removed Python 3.8 support, introduced conditional imports, minimized the number of dependencies required for installation, and updated the list of extra dependencies.
Backwards Incompatible Changes
The release does not break backward compatibility in terms of code, but it does break it in terms of library dependencies. For example, if you trained a model using RePlay 0.19.0, you can easily load the model's weights in the current release, but you will have to update the list of dependencies.
New Features
Python 3.12 support and discontinuation of support for Python 3.8
We keep up with the times and understand the importance of new technologies: they bring new opportunities for increasing performance and scaling solutions. Therefore, we are pleased to announce that this release fully supports Python 3.12.
In addition, the library is discontinuing support for Python 3.8.
New version of dependencies and Conditional imports
The library is used in several modes - research and industrial solutions. In industrial solutions, it is very important to meet the requirements of performance, the size of docker images and, as a result, the number of dependencies.
We understand that the library must be very flexible for use in all modes. Therefore, we have updated the list of dependencies to the minimum required to install the core version of the RePlay. Now, in order to use the library-specific functionality, the user must install the necessary dependencies themselves.
Dependencies on optuna, nmslib, and hnswlib have been removed from the core version of the library. If necessary, install the following packages yourself:
- `optuna` - to optimize the parameters of the non-neural models
- `nmslib`, `hnswlib` - to use the ANN algorithm
- `torch`, `lightning` - to use neural models. Please note that you can install these dependencies via the extra `[torch]`
- `pyspark` - for processing large amounts of data. Please note that you can install this dependency via the extra `[spark]`
We check that there are versions of dependencies that enable the full functionality of the library.
Improvements
Updating the list of extra dependencies
The release removes the possibility of installing with the extras `torch-openvino` and `all`. In other words, you will no longer be able to do:

```bash
pip install replay-rec[torch-openvino]
# or
pip install replay-rec[all]
```

The release only supports installation with the extras `torch` and `spark`.
Note: if you are installing the library with the extra `torch` and want a CPU-only torch, you need to add an extra index: `--extra-index-url https://download.pytorch.org/whl/cpu`.
If you want to install the library with both extras, just list them separated by commas:

```bash
pip install replay-rec[torch, spark]
```

Bug fixes
[Experimental] Adapted the DDPG algorithm to torch versions 2.6.0 and higher
v0.19.0
RePlay 0.19.0 Release notes
- Highlights
- Backwards Incompatible Changes
- New Features
- Improvements
- Bug fixes
Highlights
In this release, we have added ScalableCrossEntropyLoss and ConsecutiveDuplicatesFilter. This release brings a lot of improvements and bug fixes - see the respective sections!
Backwards Incompatible Changes
This release includes changes that are not backward compatible with previous versions of RePlay. We have changed the architecture of the Bert4Rec model to speed it up, so in this release you will not be able to load the weights of a model trained with previous versions.
New Features
ScalableCrossEntropyLoss for SasRec model
We added ScalableCrossEntropyLoss, a new approximation of CrossEntropyLoss aimed at solving the problem of GPU memory shortage when training on large item catalogs. The reference article can be found at https://arxiv.org/pdf/2409.18721.
ConsecutiveDuplicatesFilter
We added a new filter, ConsecutiveDuplicatesFilter, that removes consecutive duplicate interactions from sequential datasets.
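The filter's effect can be sketched with `itertools.groupby` (an illustration only, not RePlay's implementation): runs of consecutive identical interactions collapse to one, while non-adjacent repeats are kept.

```python
from itertools import groupby

# Illustrative sketch, not RePlay's code: collapse consecutive duplicates
# in a single user's interaction sequence.

def drop_consecutive_duplicates(sequence):
    return [item for item, _run in groupby(sequence)]

print(drop_consecutive_duplicates([1, 1, 2, 2, 2, 3, 1]))  # [1, 2, 3, 1]
```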
Improvements
SequenceEncodingRule speedup on PySpark
We accelerated the transform() method of SequenceEncodingRule when applying it to PySpark dataframes.
Updating the maximum supported version of PyTorch
We updated the maximum supported version of PyTorch, so it is now possible to install RePlay with PyTorch < 3.0.0.
Speedup sequential models
Firstly, we replaced the self-made LayerNorm and GELU layers in Bert4Rec with PyTorch's built-in implementations. Secondly, we added a CE_restricted loss for Bert4Rec that works like CrossEntropyLoss but exploits features of the Bert4Rec architecture to speed up calculations (sparsification: restricting computation to the masked positions whose tokens will be predicted). Thirdly, we replaced some computationally inefficient operations with faster analogues in SasRec and Bert4Rec.
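The sparsification idea behind CE_restricted can be sketched in plain Python (illustrative only; function names and the batch layout are assumptions, not RePlay's implementation): cross-entropy is computed only at the positions whose tokens will be predicted, instead of over every position in the sequence.

```python
import math

# Illustrative sketch, not RePlay's code: cross-entropy restricted to the
# masked positions that will actually be predicted.

def cross_entropy(logits, target):
    """CE for one position: log-sum-exp of logits minus the target logit."""
    log_sum = math.log(sum(math.exp(x) for x in logits))
    return log_sum - logits[target]

def restricted_ce(logits_per_pos, targets, predict_mask):
    """Average CE over masked positions only, skipping the rest."""
    losses = [
        cross_entropy(logits, target)
        for logits, target, masked in zip(logits_per_pos, targets, predict_mask)
        if masked
    ]
    return sum(losses) / len(losses)

logits_per_pos = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [0.1, 0.1, 3.0]]
targets = [0, 1, 2]
predict_mask = [False, True, True]  # loss only at the last two positions

print(round(restricted_ce(logits_per_pos, targets, predict_mask), 4))
```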
Bug fixes
Fix error with accessing object fields in TensorSchema
We fixed an issue where it was not possible to train a sequential model when Hydra and MLflow were installed alongside RePlay. It was caused by accessing object fields using wrong names in TensorSchema.
Fix unexpected type casts in LabelEncodingRule with Pandas.
We detected unexpected type casts in the transform() method when using Pandas dataframes with LabelEncodingRule and fixed this behaviour.
Fix bugs in Surprisal metric calculation
We fixed incorrect Surprisal behavior with cold items on Polars and missing users on Pandas.