Will "new" users appear in the test set? That is, do the users in the full training set cover all users in the test set, or will some users appear only in the test set?
We want to create user embeddings, but it seems there is no GPU available in the test environment, so we can't efficiently run inference to get embeddings for those new users (it would exceed the time limit). Or will there be a GPU in the test environment?
Reply:
Hi, by design, in the original dataset, all engaging users (for whom you are making the prediction) in the test set should be in the set of engaging users from the training set.
However, since we are scrubbing the dataset and removing rows corresponding to deleted tweets, a small fraction of users (currently less than 1%) have disappeared from the training set but are still in the test set, because the tweets they engaged with were removed. As for the engaged-with users, new users can certainly appear in the test or validation sets.
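To see how large the cold-user fraction is on your own copy of the data, something like the following might help. It is only a sketch: the function name and the toy user IDs are made up for illustration, and it assumes you have already extracted the engaging-user IDs from each split into plain Python sequences.

```python
def cold_user_fraction(train_user_ids, test_user_ids):
    """Return the share of distinct test users never seen in training."""
    train_users = set(train_user_ids)
    test_users = set(test_user_ids)
    if not test_users:
        return 0.0
    # Users present in the test split but absent from the training split.
    cold_users = test_users - train_users
    return len(cold_users) / len(test_users)


# Toy IDs for illustration only (not taken from the dataset).
train_ids = ["u1", "u2", "u3", "u4"]
test_ids = ["u2", "u3", "u5", "u3"]
fraction = cold_user_fraction(train_ids, test_ids)
```

The fraction is computed over distinct users, so a user who appears in many test rows is counted once.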
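Note that XGBoost handles missing values natively (it learns a default branch direction for them at each split rather than imputing a value), so we should be fine as long as the number of cold users is relatively low.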
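One way to use this for cold users is to leave their embedding features as NaN instead of running inference at test time. The sketch below is an assumption about how that could look, not the organizers' pipeline: `build_user_features`, the `embeddings` dict, and the 2-dimensional vectors are all hypothetical. The resulting rows could then be passed to XGBoost, which, as far as I know, treats NaN as missing by default.

```python
import math


def build_user_features(user_ids, embeddings, dim):
    """Look up a precomputed embedding per user; cold users get NaN rows."""
    rows = []
    for user_id in user_ids:
        vector = embeddings.get(user_id)
        if vector is None:
            # Cold user: mark every embedding feature as missing.
            vector = [math.nan] * dim
        rows.append(list(vector))
    return rows


# Hypothetical embeddings precomputed offline from the training set.
embeddings = {"u1": [0.1, 0.2], "u2": [0.3, 0.4]}
features = build_user_features(["u1", "u9"], embeddings, dim=2)
```

Here `"u9"` has no precomputed embedding, so its row is all NaN and no GPU inference is needed for it.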