Course taught by COURSERA
This repository contains the following files:
- README.md, provides an overview of the dataset and how it was created.
- CodeBook.md, describes the contents of the data set (data, variables and transformations used to generate the data).
- Run_Analysis.R, the R script that was used to create the data set.
- Tidy_Data.txt, contains the data set.
The source dataset comes from the Human Activity Recognition Using Smartphones project, whose authors describe how the data was originally collected as follows:
The experiments have been carried out with a group of 30 volunteers within an age bracket of 19-48 years. Each person performed six activities (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING) wearing a smartphone (Samsung Galaxy S II) on the waist. Using its embedded accelerometer and gyroscope, we captured 3-axial linear acceleration and 3-axial angular velocity at a constant rate of 50Hz. The experiments have been video-recorded to label the data manually. The obtained dataset has been randomly partitioned into two sets, where 70% of the volunteers was selected for generating the training data and 30% the test data. The sensor signals (accelerometer and gyroscope) were pre-processed by applying noise filters and then sampled in fixed-width sliding windows of 2.56 sec and 50% overlap (128 readings/window). The sensor acceleration signal, which has gravitational and body motion components, was separated using a Butterworth low-pass filter into body acceleration and gravity. The gravitational force is assumed to have only low frequency components, therefore a filter with 0.3 Hz cutoff frequency was used. From each window, a vector of features was obtained by calculating variables from the time and frequency domain.
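The windowing figures in the description above can be checked with a little arithmetic. The following is a minimal sketch in R, not part of Run_Analysis.R; the variable names and the 320-reading signal length are hypothetical and chosen only for illustration.

```r
# Figures taken from the quoted description
sampling_rate_hz <- 50     # sensor sampling rate
window_seconds   <- 2.56   # fixed window width
overlap_fraction <- 0.5    # 50% overlap between consecutive windows

# Readings per window and the step between consecutive window starts
readings_per_window <- sampling_rate_hz * window_seconds
step_readings <- readings_per_window * (1 - overlap_fraction)

# Start indices of the windows over a hypothetical signal of 320 readings
signal_length <- 320
window_starts <- seq(1, signal_length - readings_per_window + 1, by = step_readings)

print(readings_per_window)
print(step_readings)
print(window_starts)
```

With a 50% overlap, each new window starts halfway through the previous one, so every reading (apart from those at the edges of the signal) contributes to two windows.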
The Run_Analysis.R script can be used to create the dataset. It retrieves the source dataset and transforms it into the final dataset through the following steps:
- Download and unzip the source data if it doesn't already exist locally.
- Combine the training and test sets to create one data set.
- Extract only the measurements on the mean and standard deviation for each measurement.
- Use descriptive activity names to name the activities in the dataset.
- Properly label the dataset with descriptive variable names.
- Create a second, independent tidy data set with the average of each variable for each activity and each subject.
This script requires the packages "plyr" and "data.table".
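The transformation steps listed above can be illustrated on toy data. This is a hedged sketch, not the contents of Run_Analysis.R: it uses base R (`rbind`, `grepl`, `gsub`, `aggregate`) in place of "plyr" and "data.table" so that it runs without extra packages, it skips the download step, and the toy values, the two activity labels used, and the renaming rules are assumptions made for illustration.

```r
# Toy stand-ins for the training and test sets (values are made up)
train <- data.frame(subject = c(1, 1, 2), activity = c(1, 2, 1),
                    "tBodyAcc-mean()-X" = c(0.20, 0.30, 0.40),
                    "tBodyAcc-std()-X"  = c(0.01, 0.02, 0.03),
                    "tBodyAcc-max()-X"  = c(0.90, 0.80, 0.70),
                    check.names = FALSE)
test <- data.frame(subject = c(1, 2), activity = c(1, 1),
                   "tBodyAcc-mean()-X" = c(0.40, 0.60),
                   "tBodyAcc-std()-X"  = c(0.03, 0.05),
                   "tBodyAcc-max()-X"  = c(0.60, 0.50),
                   check.names = FALSE)

# Combine the training and test sets
combined <- rbind(train, test)

# Keep the identifier columns plus the mean() and std() measurements
keep <- grepl("subject|activity|mean\\(\\)|std\\(\\)", names(combined))
extracted <- combined[, keep]

# Replace activity codes with descriptive names (only two of the six shown)
extracted$activity <- factor(extracted$activity, levels = 1:2,
                             labels = c("WALKING", "WALKING_UPSTAIRS"))

# Give the variables descriptive names (hypothetical renaming rules)
names(extracted) <- gsub("^t", "Time", names(extracted))
names(extracted) <- gsub("Acc", "Accelerometer", names(extracted))
names(extracted) <- gsub("-mean\\(\\)", "Mean", names(extracted))
names(extracted) <- gsub("-std\\(\\)", "Std", names(extracted))
names(extracted) <- gsub("-", "", names(extracted))

# Average each variable for each subject and activity
tidy <- aggregate(. ~ subject + activity, data = extracted, FUN = mean)
tidy <- tidy[order(tidy$subject, tidy$activity), ]
print(tidy)
```

The result has one row per subject and activity pair and one column per retained measurement, which is the same shape as the data set the steps above describe.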