This repository reproduces the experiments in the paper: "Exploiting ID-Text Complementarity via Ensembling for Sequential Recommendation". This includes an implementation of EnsRec, our simple ID + Text ensembling strategy for sequential recommendation.
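At a high level, EnsRec trains an ID-only model and a Text-only model independently and fuses their ranking scores at test time. As a rough, non-authoritative sketch (the exact fusion rule is defined in the paper and in notebooks/test_ensrec.ipynb), score-level ensembling looks like:

```python
import numpy as np

def ensemble_scores(u_id, i_id, u_text, i_text):
    """Fuse ID-only and Text-only scores. The z-normalization here is an
    assumption, used only to make the two score distributions comparable."""
    def z(s):
        return (s - s.mean(axis=1, keepdims=True)) / (s.std(axis=1, keepdims=True) + 1e-8)
    s_id = u_id @ i_id.T        # (num_users, num_items) ID-only scores
    s_text = u_text @ i_text.T  # (num_users, num_items) Text-only scores
    return 0.5 * (z(s_id) + z(s_text))
```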
- Python 3.10+
- CUDA-compatible GPU (recommended)
```bash
# clone project
git clone ...
cd ensrec

# create conda environment
conda create -n ensrec python=3.11
conda activate ensrec

# install requirements
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu124
```

Please download the preprocessed data from this Google Drive link. We have followed the same preprocessing as in LIGER.
Please unzip the downloaded archive and place it in the data/ folder, structured as follows:
```
data/
├── beauty/          # Beauty from Amazon Reviews 2018
│   ├── training/    # training sequence of user history
│   ├── evaluation/  # validation sequence of user history
│   ├── testing/     # testing sequence of user history
│   └── items/       # text of all items in the dataset
├── sports/          # Sports from Amazon Reviews 2018
├── toys/            # Toys from Amazon Reviews 2018
└── steam/           # Steam
```
All folders beauty/, sports/, toys/ and steam/ should have training/, evaluation/, testing/ and items/ subfolders.
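A quick sanity check of the layout (a minimal sketch; paths follow the tree above):

```python
from pathlib import Path

# Verify that every dataset has the four expected subfolders.
for dataset in ["beauty", "sports", "toys", "steam"]:
    for subfolder in ["training", "evaluation", "testing", "items"]:
        path = Path("data") / dataset / subfolder
        assert path.is_dir(), f"missing {path}"
print("Data layout looks good.")
```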
Please run notebooks/embedding_gen.ipynb to generate item text embeddings for each dataset. We use SentenceT5-XXL by default.
After completing this, the data is ready for experimentation.
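For reference, the core of the embedding-generation step amounts to something like the sketch below (the notebook is the source of truth; the item-text loading and output filename here are placeholders):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/sentence-t5-xxl")
item_texts = ["example item title and description"]  # load from data/<dataset>/items/
emb = model.encode(item_texts, batch_size=32, show_progress_bar=True)
np.save("data/beauty/items/text_embeddings.npy", emb)  # hypothetical output path
```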
```bash
bash scripts/run_ensrec.sh --method_1=id_only --method_2=text_only --dataset=beauty --seed=42
```

This will first separately train the ID-Only and Text-Only models and save their test user and item embeddings.
Then, it will run the notebook notebooks/test_ensrec.ipynb to evaluate test recommendation performance, including complementarity statistics.
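If you want to inspect the saved artifacts yourself, the evaluation boils down to something like the following sketch (all file names here are hypothetical; notebooks/test_ensrec.ipynb handles the actual paths and metrics):

```python
import numpy as np

# Hypothetical artifact paths; adjust to wherever run_ensrec.sh saved them.
u_id, i_id = np.load("id_only_user.npy"), np.load("id_only_item.npy")
u_tx, i_tx = np.load("text_only_user.npy"), np.load("text_only_item.npy")
targets = np.load("test_targets.npy")  # ground-truth next item per test user

def hits_at_k(u, i, k=10):
    topk = np.argsort(-(u @ i.T), axis=1)[:, :k]
    return (topk == targets[:, None]).any(axis=1)

h_id, h_tx = hits_at_k(u_id, i_id), hits_at_k(u_tx, i_tx)
print("ID-only Recall@10:  ", h_id.mean())
print("Text-only Recall@10:", h_tx.mean())
# One simple complementarity statistic: test users that exactly one model gets right.
print("Exactly one model hits:", (h_id ^ h_tx).mean())
```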
You can also run a slightly modified version of EnsRec that trains the ID-Only and Text-Only models in the same experiment run. Note that this differs from the EnsRec result we report, since it early-stops on the ensemble's validation performance instead of early-stopping the ID-Only and Text-Only models independently based on their own validation performance (and only ensembling them for testing). To run this version:
```bash
bash scripts/run_ensrec.sh --method_1=id_only --method_2=text_only --dataset=beauty --seed=42 --one_job
```

To reproduce the complementarity results in Table 2, method_1 and method_2 should be changed to any variant of the ID-Only and/or Text-Only methods, e.g.
```bash
bash scripts/run_ensrec.sh --method_1=id_only --method_2=id_only/ablate_encoder --dataset=beauty
```

Specifically, the possible values of method_1 and method_2 are:
- id_only
- id_only/ablate_encoder
- id_only/ablate_negatives
- id_only/ablate_init
- text_only
- text_only/ablate_encoder
- text_only/ablate_negatives
- text_only/ablate_lm
dataset can be one of:
- beauty
- toys
- sports
- steam
```bash
bash scripts/run_baseline.sh --method=fdsa --dataset=beauty --seed=42
```

This will train and test the baseline method, which can be one of:
- id_only
- text_only
- llm_init
- whitenrec
- unisrec
- rlmrec_con
- rlmrec_gen
- llm_esr
- alphafuse
- fdsa
First you should run EnsRec on the desired dataset using the command above.
Then, run the notebook notebooks/ablate_ensrec.ipynb, setting dataset and seeds in the first code cell (the parameters cell) as appropriate.
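For example, the parameters cell might look like this (illustrative values; variable names assumed to match the notebook):

```python
# Parameters cell of notebooks/ablate_ensrec.ipynb -- illustrative only.
dataset = "beauty"    # one of: beauty, toys, sports, steam
seeds = [42, 43, 44]  # seeds of the EnsRec runs to aggregate
```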
You can train and test any model with a chosen experiment configuration from configs/experiment/:
```bash
python src/train.py experiment=id_only/train_beauty
```

You can override any parameter from the command line like this:
```bash
python src/train.py experiment=id_only/train_beauty trainer.max_epochs=20 optim.optimizer.lr=0.001 model.d_model=64
```

Training and evaluation logs are written to logs/. By default, metrics are logged in CSV format; this can be switched to TensorBoard by passing logger=tensorboard.
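To inspect the logged metrics programmatically, something like this works (the exact log layout depends on the run; this sketch just grabs the newest CSV):

```python
from pathlib import Path
import pandas as pd

# Load the most recently written metrics.csv under logs/.
latest = max(Path("logs").rglob("metrics.csv"), key=lambda p: p.stat().st_mtime)
print(pd.read_csv(latest).tail())
```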
This repo is based on GRID: Generative Recommendation with Semantic IDs. It also adapts code from AlphaFuse: Learn ID Embeddings for Sequential Recommendation in Null Space of Language Embeddings for the AlphaFuse, UniSRec, WhitenRec, and RLMRec implementations. It adopts the data preprocessing and default hyperparameter settings from Unifying Generative and Dense Retrieval for Sequential Recommendation.
For questions and support, please create a GitHub issue or contact Liam Collins ([email protected]).
If you find our paper and/or code useful, please use the following citation:
```bibtex
@article{collins2025exploiting,
  title={Exploiting ID-Text Complementarity via Ensembling for Sequential Recommendation},
  author={Collins, Liam and Kumar, Bhuvesh and Ju, Clark Mingxuan and Zhao, Tong and Loveland, Donald and Neves, Leonardo and Shah, Neil},
  journal={arXiv preprint arXiv:2512.17820},
  year={2025}
}
```