This repository provides materials for a session of the I2DS Tools for Data Science workshop, held at the Hertie School, Berlin, in October 2023. The student-run workshop is part of the course Introduction to Data Science, taught by Simon Munzert in Fall 2023.
This session introduces the modern data wrangling workflow with R and data.table. Data wrangling is one of the core steps in the data science workflow. data.table is an R package that extends the base data.frame. Its general form, DT[i, j, by], covers the most common data manipulation tasks with one consistent syntax: i selects rows, j selects or computes columns, and by groups the computation.
The goals of this session are to (1) equip you with conceptual knowledge of the data.table package and the data wrangling workflow, (2) show you the three key components of its syntax (i, j, by), and (3) provide you with practice material and further readings.
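The `i`, `j`, and `by` components mentioned above can be sketched on a small example. This is a minimal illustration, not taken from the session material: the table `dt` and its columns `id`, `group`, and `score` are invented for this purpose, and the code assumes the data.table package is installed.

```r
library(data.table)

# Toy data, invented for illustration only
dt <- data.table(
  id    = 1:6,
  group = c("a", "a", "b", "b", "c", "c"),
  score = c(82, 74, 91, 68, 77, 85)
)

# i: subset rows (keep rows with a score of at least 75)
high <- dt[score >= 75]

# j: select or compute on columns (mean score over all rows)
mean_all <- dt[, mean(score)]

# by: repeat the computation in j within each group;
# .N is data.table's built-in row count per group
by_group <- dt[, .(mean_score = mean(score), n = .N), by = group]

# All three together: filter rows, then summarise per group
high_by_group <- dt[score >= 75, .(mean_score = mean(score)), by = group]

print(by_group)
```

Reading the call as "take `dt`, subset rows with `i`, then compute `j`, grouped by `by`" is a common way to keep the three positions apart.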
The material in this repository is made available under the MIT license.
Sai Prusni Bandela prepared the material on basic data.table syntax (i, j indexing).
Milton Mier prepared the material on advanced data.table syntax (the by argument).
Minho Kang prepared the presentation slides.