
[BUG] Running EASE model with 600k ratings crashes with out of memory error #654

@filippo-orru

Description

I've been trying out various cornac models and had some success. However, running the EASE model on a training set with ~600k ratings never succeeds. Memory consumption starts at around 500MB, then quickly grows to ~60GB until, at some point, the process is killed.
(Screenshot: Activity Monitor, 2024-12-25 18:18:28)

In which platform does it happen?

macOS 15.2 running on an M1 Pro with 16GB of memory.

How do we replicate the issue?

Minimal example:

import cornac
from cornac.eval_methods import RatioSplit
from cornac.models import EASE
from cornac.metrics import NDCG
import pandas as pd

path = "training_data/training_data_ratings_20241218_224839.parquet.snappy"
df_original = pd.read_parquet(path)
print("Loaded data")

# Convert dataframe to list of tuples (user_id, item_id, rating)
data = df_original[["userID", "itemID", "rating"]].values.tolist()
rs = RatioSplit(data, test_size=0.15, val_size=0.1, rating_threshold=3.0)
print(f"{len(data)} ratings: {rs.train_size} training and {rs.test_size} test")

ndcg = NDCG(k=10)
metrics = [ndcg]

ease = EASE()
models = [ease]

cornac.Experiment(eval_method=rs, models=models, metrics=metrics, user_based=True, verbose=True).run()

print("Done!")

ease.recommend("my_user_id", k=10) # Never reaches this point

Expected behavior (i.e. solution)

The experiment should run successfully and output the results. 600k training samples isn't that much, and the model is extremely simple. I don't see how it would need this much memory.
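As a rough way to sanity-check that expectation, here is a minimal sketch that estimates the size of a single dense item-by-item matrix. It assumes the usual closed-form EASE formulation, which builds and inverts a dense item-item Gram matrix, so memory would scale with the square of the number of distinct items rather than with the number of ratings. Whether Cornac's implementation allocates exactly this, or additional dense matrices, is not confirmed here. The helper names `dense_matrix_gib` and `count_items`, the sample ratings, and the item count of 50,000 are all hypothetical and not taken from the dataset above.

```python
def dense_matrix_gib(n_rows, n_cols, bytes_per_value=8):
    """Size in GiB of a dense n_rows x n_cols matrix (float64 by default)."""
    return n_rows * n_cols * bytes_per_value / 2**30


def count_items(ratings):
    """Number of distinct item IDs in a list of (user_id, item_id, rating) tuples."""
    return len({item_id for _, item_id, _ in ratings})


# Tiny made-up sample in the same (user_id, item_id, rating) format as `data`.
sample = [("u1", "i1", 4.0), ("u1", "i2", 3.0), ("u2", "i1", 5.0)]
n_items_sample = count_items(sample)

# Hypothetical catalogue size, for illustration only.
n_items = 50_000
gram_gib = dense_matrix_gib(n_items, n_items)
print(f"{n_items} items -> {gram_gib:.1f} GiB for one dense item-item matrix")
```

Running `count_items(data)` on the real ratings list would show whether the number of distinct items, rather than the 600k ratings, could plausibly account for the observed growth.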
