Re "observational unit" and "level", I still think clearer definitions would be useful, perhaps in 2.3 or in 3.4.
I wonder if the observations about denormalization (paragraph 3 of 3.4 and the last paragraph of 4.1) might be worth elaborating on? Or maybe not... the join examples in 5 might be enough.
I really like the "tidy data" framework and the way it highlights the cleanness and unity of the tools you've developed.
On p. 22, line 4, I think the comma after "data" is inappropriate because the "which" starts a restrictive clause.
About the ETL literature, I agree that a lot of it is either commercial or very IT-ey (as opposed to CS-ey), but there does seem to be a fair amount of discussion around data warehouse schemas. Some other papers I've come across (I don't claim to have done a serious literature search!) include Boehnlein, Vassiliadis et al., and Vassiliadis (p. 9ff talks about the "pivoting problem").