Machine learning brings together computer science and statistics to harness the predictive power of data.
This class teaches the end-to-end process of investigating data through a machine learning lens. It taught me how to extract and identify the features that best represent the data, how to apply several of the most important machine learning algorithms (Naive Bayes, SVM, decision trees, and unsupervised learning methods), and how to evaluate their performance.
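As a rough illustration of comparing the supervised algorithms named above, here is a minimal sketch. It runs on synthetic data from `make_classification`, not on the project's data, and the model settings (`max_depth=4`, the RBF kernel, 5-fold cross-validation, F1 scoring) are assumptions chosen for the example rather than the settings used in the course or this project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (an assumption; NOT the Enron data).
X, y = make_classification(
    n_samples=200,
    n_features=10,
    n_informative=4,
    weights=[0.85, 0.15],
    random_state=42,
)

# Illustrative hyperparameters only.
models = {
    "naive_bayes": GaussianNB(),
    # SVMs are sensitive to feature scale, so standardize first.
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=42),
}

# Stratified folds keep the minority-class share similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="f1").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean F1 = {score:.3f}")
```

F1 is used here instead of accuracy because, with an imbalanced target, a model that never predicts the minority class can still look accurate.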
The goal of this project was to build a prediction model that identifies persons of interest (POIs) using the scikit-learn, NumPy, and pandas modules in Python. POIs are individuals who were indicted, reached a settlement or plea deal with the government, or testified in exchange for immunity from prosecution. Financial compensation data and aggregate email statistics from the Enron Corpus were used as features for prediction.
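The workflow described above (financial and email features in, POI label out) might be sketched as follows. Everything in this block is an assumption for illustration: the column names are hypothetical, the values are randomly generated rather than taken from the Enron Corpus, and the synthetic labels are deliberately constructed to correlate with the engineered ratio so the example has some signal. The choice of `SelectKBest` with Gaussian Naive Bayes is one plausible setup, not necessarily the one used in this project.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
n = 150

# Hypothetical column names with random values; NOT the real Enron data.
df = pd.DataFrame({
    "salary": rng.normal(200_000, 50_000, n),
    "bonus": rng.normal(500_000, 200_000, n),
    "emails_sent": rng.integers(1, 500, n),
    "emails_sent_to_poi": rng.integers(0, 50, n),
})
# A person cannot send more emails to POIs than they sent in total.
df["emails_sent_to_poi"] = np.minimum(df["emails_sent_to_poi"], df["emails_sent"])

# Engineered feature: share of a person's sent emails addressed to POIs.
df["fraction_to_poi"] = df["emails_sent_to_poi"] / df["emails_sent"]

# Synthetic labels: the top 15% of a noisy score are marked as POI.
score = df["fraction_to_poi"] + rng.normal(0, 0.1, n)
df["poi"] = (score >= np.quantile(score, 0.85)).astype(int)

features = ["salary", "bonus", "fraction_to_poi"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["poi"], test_size=0.3, stratify=df["poi"], random_state=0
)

# Keep the 2 features with the highest ANOVA F-score, then classify.
clf = Pipeline([
    ("select", SelectKBest(f_classif, k=2)),
    ("nb", GaussianNB()),
])
clf.fit(X_train, y_train)
pred = clf.predict(X_test)

# zero_division=0 avoids a warning if no POIs are predicted.
precision = precision_score(y_test, pred, zero_division=0)
recall = recall_score(y_test, pred, zero_division=0)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Precision and recall are reported separately because POIs are a small minority: precision shows how many flagged people really are POIs, and recall shows how many POIs were found.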
Implemented with the Python libraries NumPy, pandas, scikit-learn, and Matplotlib in an Anaconda Jupyter Notebook.
Originally inspired by a Udacity course.