How do we validate models?

Possible approaches:

1. We have data starting on February 4, 2021 at 00:00:00 and ending on February 24, 2021 at 23:59:56, for a total of 21 days (3 weeks). We can treat the data as a time series and hold out the last X days of the training set to use as validation data.
2. Split train/test (X/Y%), grouping by `tweet_id` to keep all data points related to the same tweet in the same fold.
   a. It may not make sense to use values from the future to predict values in the past, so we probably shouldn't validate a model trained on later samples against a validation set drawn from earlier ones.
   b. On the other hand, tweets may be (almost) stationary, so a shift in time would not affect them much. The window is also short: we're talking about less than one month, not years, so we can probably treat the data as stationary? -> See eda-stationarity.ipynb
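To make the trade-off concrete, here is a minimal sketch of both splits in plain Python. It is only an illustration under assumptions: the row layout, the `timestamp` field name, the helper names `time_holdout` and `grouped_split`, and the toy data are all hypothetical; only `tweet_id` and the date range come from the post above.

```python
import random
from datetime import datetime, timedelta

def time_holdout(rows, holdout_days):
    """Approach 1: hold out the last `holdout_days` calendar days as validation."""
    last = max(r["timestamp"] for r in rows)
    # Cutoff is midnight at the start of the first held-out day.
    last_midnight = datetime.combine(last.date(), datetime.min.time())
    cutoff = last_midnight - timedelta(days=holdout_days - 1)
    train = [r for r in rows if r["timestamp"] < cutoff]
    valid = [r for r in rows if r["timestamp"] >= cutoff]
    return train, valid

def grouped_split(rows, valid_fraction, seed=0):
    """Approach 2: random split that keeps all rows of a tweet_id in one fold."""
    ids = sorted({r["tweet_id"] for r in rows})
    random.Random(seed).shuffle(ids)
    n_valid = max(1, round(len(ids) * valid_fraction))
    valid_ids = set(ids[:n_valid])
    train = [r for r in rows if r["tweet_id"] not in valid_ids]
    valid = [r for r in rows if r["tweet_id"] in valid_ids]
    return train, valid

# Toy data (hypothetical): one tweet per day over the 21 days, two rows per tweet.
start = datetime(2021, 2, 4)
rows = [
    {"tweet_id": day, "timestamp": start + timedelta(days=day, hours=hour)}
    for day in range(21)
    for hour in (0, 12)
]

train_t, valid_t = time_holdout(rows, holdout_days=7)
train_g, valid_g = grouped_split(rows, valid_fraction=0.2)

print(len(train_t), len(valid_t))
print(len(train_g), len(valid_g))
```

In the time-based split every validation row is later than every training row, which addresses concern (a), but a tweet whose data points straddle the cutoff would end up in both folds. The grouped split guarantees the opposite: no tweet is shared between folds, while validation tweets may be older than training tweets, which is only acceptable if the stationarity assumption in (b) holds.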