Build a pipeline that, given a document, tells you whether Olmo's data pre-filtering would have removed it from the training data.
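
One possible shape for such a pipeline is a list of independent filter functions, each returning a reason when it would drop the document, combined into a single verdict. The sketch below is only an illustration of that structure: the three heuristics, their thresholds, and every name (`Verdict`, `min_word_count`, `repeated_line_fraction`, `alpha_word_fraction`, `would_be_filtered`) are hypothetical placeholders, not Olmo's actual filters. A faithful version would need to replace them with the rules and thresholds documented for the Olmo training data.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# A filter returns a reason string if it would drop the document, else None.
FilterFn = Callable[[str], Optional[str]]


@dataclass
class Verdict:
    kept: bool
    reasons: List[str]  # one entry per filter that rejected the document


def min_word_count(text: str, threshold: int = 50) -> Optional[str]:
    # Placeholder heuristic: the threshold is an assumption, not Olmo's value.
    n = len(text.split())
    return f"too_few_words({n})" if n < threshold else None


def repeated_line_fraction(text: str, threshold: float = 0.3) -> Optional[str]:
    # Placeholder heuristic: fraction of non-empty lines that are duplicates.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return "empty_document"
    dup = 1 - len(set(lines)) / len(lines)
    return f"repeated_lines({dup:.2f})" if dup > threshold else None


def alpha_word_fraction(text: str, threshold: float = 0.5) -> Optional[str]:
    # Placeholder heuristic: share of words containing at least one letter.
    words = text.split()
    if not words:
        return "empty_document"
    frac = sum(any(c.isalpha() for c in w) for w in words) / len(words)
    return f"low_alpha_fraction({frac:.2f})" if frac < threshold else None


# Hypothetical filter set; swap in the real pre-filtering rules here.
DEFAULT_FILTERS: List[FilterFn] = [
    min_word_count,
    repeated_line_fraction,
    alpha_word_fraction,
]


def would_be_filtered(text: str, filters: Optional[List[FilterFn]] = None) -> Verdict:
    # Run every filter so the verdict lists all reasons, not just the first.
    active = DEFAULT_FILTERS if filters is None else filters
    reasons = [r for f in active if (r := f(text)) is not None]
    return Verdict(kept=not reasons, reasons=reasons)
```

Calling `would_be_filtered(document_text)` returns a `Verdict` whose `kept` flag answers the yes/no question and whose `reasons` list records which filters fired. Running all filters rather than stopping at the first rejection makes it easier to compare the pipeline's decisions against the real pre-filtering output while calibrating it.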