In this Problem Set we will solve an unsupervised learning problem using the k-means clustering algorithm.
Read the wine data from the link provided below. Split the wine data into X and y. The X should have the features associated with each class of wine. The y should indicate the type of wine.
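A minimal sketch of the loading step. The link to the data is not reproduced here, so this assumes the copy of the wine dataset bundled with scikit-learn (`load_wine`) is an acceptable stand-in; if the linked file differs, the column handling will need adjusting.

```python
from sklearn.datasets import load_wine

# Assumption: scikit-learn's bundled wine data stands in for the linked file.
wine = load_wine()

X = wine.data    # feature matrix, one row per wine sample
y = wine.target  # integer wine class for each sample

print("X shape:", X.shape)
print("y shape:", y.shape)
print("classes:", sorted(set(y.tolist())))
```

Keeping y separate matters here: k-means never sees it, and it is used only afterwards to judge the clustering.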
Perform PCA and extract the top two components.
Generate a scatter plot for the 2 components generated by PCA. Do they appear to be in clusters of 3?
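One possible way to carry out the PCA and scatter-plot steps is sketched below. It assumes the scikit-learn copy of the wine data, and it standardizes the features before PCA; the standardization is a choice made for this sketch, not something the problem statement requires. The helper `plot_components` is a hypothetical name, and it imports matplotlib lazily so the PCA part runs even without a plotting backend.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: bundled wine data as a stand-in for the linked file.
X, y = load_wine(return_X_y=True)

# Assumption: scale features so no single large-valued column dominates PCA.
X_scaled = StandardScaler().fit_transform(X)

# Keep only the two directions of largest variance.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)
print("explained variance ratio:", pca.explained_variance_ratio_)


def plot_components(points, labels):
    """Scatter the two PCA components, colored by the given labels."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, ax = plt.subplots()
    ax.scatter(points[:, 0], points[:, 1], c=labels, cmap="viridis", s=20)
    ax.set_xlabel("Principal component 1")
    ax.set_ylabel("Principal component 2")
    ax.set_title("Wine data projected onto the top two components")
    return fig
```

Calling `plot_components(X_pca, y)` and then `matplotlib.pyplot.show()` displays the figure, which can be compared against the reference plot.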
The reference plot is given below.
Run a k-means clustering model on the input data with k=3. This should generate the cluster centroids. Plot each cluster centroid together with the data points in that cluster as a scatter plot.
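A sketch of the k-means step under a few assumptions: the bundled scikit-learn wine data, standardized features, and a fixed `random_state` for repeatability. The model is fit on the full feature set, and the centroids are projected into the two PCA components only for plotting. `plot_clusters` is a hypothetical helper name.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumptions: bundled wine data, standardized features.
X, y = load_wine(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

# Fit k-means with k=3; the labels y are not used here.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = kmeans.fit_predict(X_scaled)
centroids = kmeans.cluster_centers_  # one row per cluster, in feature space

# Project points and centroids into the same 2-D PCA space for plotting.
pca = PCA(n_components=2).fit(X_scaled)
X_2d = pca.transform(X_scaled)
centroids_2d = pca.transform(centroids)


def plot_clusters(points, labels, centers):
    """Scatter the points colored by cluster and mark each centroid."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, ax = plt.subplots()
    ax.scatter(points[:, 0], points[:, 1], c=labels, cmap="viridis", s=20)
    ax.scatter(centers[:, 0], centers[:, 1], c="red", marker="X", s=200,
               label="centroids")
    ax.set_xlabel("Principal component 1")
    ax.set_ylabel("Principal component 2")
    ax.legend()
    return fig
```

Calling `plot_clusters(X_2d, clusters, centroids_2d)` draws the clusters with their centroids. Fitting on the two PCA components instead of all features is an equally reasonable reading of "input data".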
To check how well k-means performed, print the prediction accuracy and plot the confusion matrix. Computing the accuracy is not straightforward, because k-means assigns arbitrary cluster indices that need not coincide with the wine class labels. Make sure to match each predicted cluster to the original wine class before computing the accuracy.
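The matching step can be sketched as follows. This is one approach among several: it builds the confusion matrix between true classes and raw cluster indices, then uses `scipy.optimize.linear_sum_assignment` to find the one-to-one cluster-to-class mapping with the most agreements. A per-cluster majority vote would also work. The data source, scaling, and `random_state` are assumptions of the sketch.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.preprocessing import StandardScaler

# Assumptions: bundled wine data, standardized features, fixed seed.
X, y = load_wine(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)

# Rows are true wine classes, columns are raw cluster indices.
cm_raw = confusion_matrix(y, clusters)

# Pick the one-to-one pairing that maximizes the matched counts
# (linear_sum_assignment minimizes cost, so negate the counts).
class_idx, cluster_idx = linear_sum_assignment(-cm_raw)
cluster_to_class = {int(c): int(k) for k, c in zip(class_idx, cluster_idx)}

# Relabel every cluster index with its matched wine class.
y_pred = np.array([cluster_to_class[int(c)] for c in clusters])

accuracy = accuracy_score(y, y_pred)
cm_matched = confusion_matrix(y, y_pred)

print("cluster -> class mapping:", cluster_to_class)
print("accuracy after matching:", round(accuracy, 3))
print(cm_matched)
```

After matching, the diagonal of `cm_matched` holds the correctly grouped samples. The matrix can be plotted with `sklearn.metrics.ConfusionMatrixDisplay(cm_matched).plot()`, which needs matplotlib.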
Run the KMeans model for different values of