I'm not sure why "diamonds" is not included.
Also, you have "houses" and "boston", which are both housing datasets (though quite different in nature?). But neither is at all similar to the Ames housing dataset or the King County housing dataset, both of which have many high-cardinality categorical variables.
So I'd probably include one of those?
Also, 2dplanes is a synthetic dataset, and so is mv.