In this chapter, we discuss the process of data manipulation, learn how to explore an API to gather data, and perform data cleaning and reshaping with pandas.
We will work through five notebooks, each numbered according to the order in which it is used:
1-wide_vs_long.ipynb: discusses wide versus long format data
2-using_the_weather_api.ipynb: walks through collecting daily temperature data from the NCEI API
3-cleaning_data.ipynb: shows how to perform some initial data cleaning
4-reshaping_data.ipynb: illustrates how to reshape data with pandas
5-handling_data_issues.ipynb: showcases strategies for dealing with duplicate, missing, or invalid data
All the datasets necessary for the aforementioned notebooks, along with information on them, can be found in the data/ directory. The end-of-chapter exercises will use the datasets in the exercises/ directory; solutions to these exercises can be found in the repository's solutions/ch_03/ directory.
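As a small preview of the wide versus long distinction covered in the first notebook, the following is a minimal sketch using pandas. The temperature values and the column names (date, TMAX, TMIN, datatype, value) are made up for illustration and are not taken from the chapter's datasets; the notebooks themselves may take a different approach.

```python
import pandas as pd

# Hypothetical toy data (not one of the chapter's datasets):
# wide format has one column per measurement type.
wide_df = pd.DataFrame({
    'date': ['2018-10-01', '2018-10-02'],
    'TMAX': [21.1, 23.9],
    'TMIN': [8.9, 13.9],
})

# melt() moves the TMAX/TMIN columns into rows (wide -> long):
# each row now holds a single measurement for a single date.
long_df = wide_df.melt(
    id_vars='date', var_name='datatype', value_name='value'
)

# pivot() reverses the operation (long -> wide).
wide_again = long_df.pivot(
    index='date', columns='datatype', values='value'
).reset_index()

print(long_df)
print(wide_again)
```

Here the long format has one row per date and measurement type, while the wide format has one row per date with a column for each measurement type.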