EnsRec: Ensembling ID + Text Sequence Encoders for Sequential Recommendation

Built with PyTorch Lightning and Hydra configuration.

⚡ Overview

This repository reproduces the experiments in the paper: "Exploiting ID-Text Complementarity via Ensembling for Sequential Recommendation". This includes an implementation of EnsRec, our simple ID + Text ensembling strategy for sequential recommendation.

📦 Installation

Prerequisites

  • Python 3.10+
  • CUDA-compatible GPU (recommended)

Setup Environment

# clone project
git clone ...
cd ensrec

# create conda environment
conda create -n ensrec python=3.11
conda activate ensrec

# install requirements
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu124

🗂️ Data preparation

Please download the preprocessed data from this Google Drive link. We follow the same preprocessing as LIGER. Unzip the download and place it in the data/ folder, structured as follows:

data/
├── beauty/             # Beauty from Amazon Reviews 2018
│   ├── training/       # training sequences of user history
│   ├── evaluation/     # validation sequences of user history
│   ├── testing/        # testing sequences of user history
│   └── items/          # text of all items in the dataset
├── sports/             # Sports from Amazon Reviews 2018
├── toys/               # Toys from Amazon Reviews 2018
└── steam/              # Steam

All folders beauty/, sports/, toys/ and steam/ should have training/, evaluation/, testing/ and items/ subfolders.
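To sanity-check the layout before running experiments, a minimal script such as the following (a hypothetical helper, not part of the repo) flags any missing folders:

from pathlib import Path

# Hypothetical layout check; adjust the data root if you placed the data elsewhere.
DATASETS = ["beauty", "sports", "toys", "steam"]
SUBDIRS = ["training", "evaluation", "testing", "items"]

for dataset in DATASETS:
    for subdir in SUBDIRS:
        path = Path("data") / dataset / subdir
        if not path.is_dir():
            print(f"missing: {path}")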

Text embedding generation

Please run notebooks/embedding_gen.ipynb to generate item text embeddings for each dataset. We use SentenceT5-XXL by default.
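Conceptually, the notebook encodes each item's text into a dense vector. A minimal sketch, assuming a standard sentence-transformers setup (the text loading and output path here are our assumptions; see the notebook for the exact steps):

import numpy as np
from sentence_transformers import SentenceTransformer

# SentenceT5-XXL checkpoint from the Hugging Face hub
model = SentenceTransformer("sentence-transformers/sentence-t5-xxl")

# In practice, item texts are loaded from data/<dataset>/items/
item_texts = ["example item title and description", "another item's text"]
embeddings = model.encode(item_texts, batch_size=8, show_progress_bar=True)
np.save("beauty_item_embeddings.npy", embeddings)  # hypothetical output path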

After completing this, the data is ready for experimentation.

🚀 How to run

To run EnsRec

bash scripts/run_ensrec.sh --method_1=id_only --method_2=text_only --dataset=beauty --seed=42

This first trains the ID-Only and Text-Only models separately and saves their test user and item embeddings. It then runs the notebook notebooks/test_ensrec.ipynb to evaluate test recommendation performance, including complementarity statistics.
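For intuition, ensembling the saved embeddings amounts to combining the two models' user-item scores. A minimal sketch, assuming hypothetical .npy paths and a simple sum of per-user z-normalized dot-product scores (see notebooks/test_ensrec.ipynb for the exact procedure):

import numpy as np

# Hypothetical file names; the actual paths are produced by run_ensrec.sh.
u_id, i_id = np.load("id_user_emb.npy"), np.load("id_item_emb.npy")
u_txt, i_txt = np.load("text_user_emb.npy"), np.load("text_item_emb.npy")

def normalized_scores(u, i):
    s = u @ i.T  # (num_users, num_items) dot-product scores
    # z-normalize per user so the two models' score scales are comparable
    return (s - s.mean(axis=1, keepdims=True)) / (s.std(axis=1, keepdims=True) + 1e-8)

ens = normalized_scores(u_id, i_id) + normalized_scores(u_txt, i_txt)
top10 = np.argsort(-ens, axis=1)[:, :10]  # top-10 recommended items per user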

You can also run a slightly modified version of EnsRec that trains the ID-Only and Text-Only models in the same experiment run. Note that this differs from the EnsRec result we report: this variant early-stops on the ensemble's validation performance, whereas the reported EnsRec early-stops the ID-Only and Text-Only models independently on their own validation performances and ensembles them only at test time. To run this version:

bash scripts/run_ensrec.sh --method_1=id_only --method_2=text_only --dataset=beauty --seed=42 --one_job

To reproduce the complementarity results in Table 2, method_1 and method_2 should be set to variants of the ID-Only and/or Text-Only methods (a sweep sketch follows the lists below), e.g.

bash scripts/run_ensrec.sh --method_1=id_only --method_2=id_only/ablate_encoder --dataset=beauty 

Specifically, the possible values of method_1 and method_2 are

  • id_only
  • id_only/ablate_encoder
  • id_only/ablate_negatives
  • id_only/ablate_init
  • text_only
  • text_only/ablate_encoder
  • text_only/ablate_negatives
  • text_only/ablate_lm

dataset can be one of

  • beauty
  • toys
  • sports
  • steam
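To sweep many method pairs, a hypothetical Python driver over the flags above could look like the following (which pairs actually appear in Table 2 is specified in the paper, so trim the list accordingly):

import itertools
import subprocess

METHODS = [
    "id_only", "id_only/ablate_encoder", "id_only/ablate_negatives", "id_only/ablate_init",
    "text_only", "text_only/ablate_encoder", "text_only/ablate_negatives", "text_only/ablate_lm",
]

for m1, m2 in itertools.combinations(METHODS, 2):
    subprocess.run(
        ["bash", "scripts/run_ensrec.sh",
         f"--method_1={m1}", f"--method_2={m2}", "--dataset=beauty", "--seed=42"],
        check=True,  # stop if any run fails
    )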

To run baselines

bash scripts/run_baseline.sh --method=fdsa --dataset=beauty --seed=42

This will train and test the baseline method, which can be one of

  • id_only
  • text_only
  • llm_init
  • whitenrec
  • unisrec
  • rlmrec_con
  • rlmrec_gen
  • llm_esr
  • alphafuse
  • fdsa

To run ensemble ablation

First, run EnsRec on the desired dataset using the command above. Then, run the notebook notebooks/ablate_ensrec.ipynb, setting dataset and seeds in the first code cell (the parameters cell) as appropriate.
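If you prefer to run the ablation headlessly, papermill can inject those values into the parameters cell. This is optional and assumes (our assumption, not a repo requirement) that the first code cell is tagged parameters:

import papermill as pm

# Execute the ablation notebook with injected parameters; the output path is hypothetical.
pm.execute_notebook(
    "notebooks/ablate_ensrec.ipynb",
    "notebooks/ablate_ensrec_beauty.ipynb",
    parameters={"dataset": "beauty", "seeds": [42]},
)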

To run training and testing, in general

You can train and test any model with chosen experiment configuration from configs/experiment/:

python src/train.py experiment=id_only/train_beauty

You can override any parameter from the command line like this:

python src/train.py experiment=id_only/train_beauty trainer.max_epochs=20 optim.optimizer.lr=0.001 model.d_model=64

Logging

Training and evaluation logs are written to logs/. By default, metrics are logged in CSV format. This can be changed to TensorBoard records by passing logger=tensorboard.
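For example:

python src/train.py experiment=id_only/train_beauty logger=tensorboard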

🤝 Acknowledgements

This repo is based on the repo GRID: Generative Recommendation with Semantic IDs. It also adapts code from AlphaFuse: Learn ID Embeddings for Sequential Recommendation in Null Space of Language Embeddings for the AlphaFuse, UniSRec, WhitenRec, and RLMRec implementations. It adopts the data preprocessing and default hyperparameter settings from Unifying Generative and Dense Retrieval for Sequential Recommendation.

📞 Contact

For questions and support, please create a GitHub issue or contact Liam Collins ([email protected]).

📚 Citation

If you find our paper and/or code useful, please use the following citation:

@article{collins2025exploiting,
  title={Exploiting ID-Text Complementarity via Ensembling for Sequential Recommendation},
  author={Collins, Liam and Kumar, Bhuvesh and Ju, Clark Mingxuan and Zhao, Tong and Loveland, Donald and Neves, Leonardo and Shah, Neil},
  journal={arXiv preprint arXiv:2512.17820},
  year={2025}
}
