
khanjan2708/Prospace_Assignment


Prospace_Assignment

The image data came with no labels, so the task was treated as unsupervised learning. The first step was data exploration, which covered image shape, image visualisation, and color and color distribution. Every image has the same shape, but visualisation showed unequal and unnecessary blank spacing in the images, which lowers the quality of the data. This white space appears as zero values in the image array, so it was decided to get rid of it first. The analysis also showed that the images are mostly brown and green, and the distribution of these colors is what helps classify the images into the three required classes: No Crop, Growing, and Lush.
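As a rough illustration of the blank-space check, here is a minimal sketch using NumPy. It assumes a pixel counts as blank when all of its channels are zero; the helper names `blank_fraction` and `drop_blank_pixels` and the synthetic image are hypothetical and are not taken from the repository.

```python
import numpy as np


def blank_fraction(image: np.ndarray) -> float:
    """Fraction of pixels whose channels are all zero (assumed to be blank padding)."""
    blank = np.all(image == 0, axis=-1)
    return float(blank.mean())


def drop_blank_pixels(image: np.ndarray) -> np.ndarray:
    """Return only the non-blank pixels as an (n_pixels, channels) array."""
    mask = np.any(image != 0, axis=-1)
    return image[mask]


# Synthetic 4x4 RGB image: left half blank, right half a green tone.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[:, 2:] = (40, 160, 60)

fraction = blank_fraction(img)
kept = drop_blank_pixels(img)
print(fraction, kept.shape)
```

Flattening to a list of pixels loses the spatial layout, which is acceptable here only because the later features depend on color counts rather than position.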

Since the color distribution had already been identified as the signal that helps classify the images, a method called extract_color_feature was used. It first drops the unnecessary zero values, then computes a histogram of the colors and stores the value counts in the hist array. This array is the primary data on which the model is applied. Because this is unsupervised learning and the number of clusters is already known, K-Means clustering seemed the most applicable model. Before clustering, Principal Component Analysis (PCA) was applied so that only the important features are kept. Rather than picking an arbitrary number of components, I looked at the cumulative explained variance to find the number of components at which 95% of the variance is explained, and used that value as pca_component. The reduced data was then used for clustering. The cluster labels from K-Means were mapped to class names by manual observation, and output_class_dataframe, which contains the requested class information, was built from them. The clustering is evaluated with the silhouette score, which indicates how well separated the clusters are.
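The pipeline described above can be sketched roughly as follows with NumPy and scikit-learn. This is not the repository's code: the body of `extract_color_feature`, the bin count, the normalisation by pixel count, and the synthetic brown/green images produced by the hypothetical `fake_image` helper are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

BROWN = np.array([120, 80, 40])
GREEN = np.array([40, 160, 60])


def extract_color_feature(image, bins=8):
    """Per-channel histogram of the non-zero pixels (assumed implementation)."""
    pixels = image[np.any(image != 0, axis=-1)]  # drop blank (all-zero) pixels
    hist = [np.histogram(pixels[:, c], bins=bins, range=(0, 256))[0]
            for c in range(pixels.shape[1])]
    hist = np.concatenate(hist).astype(float)
    # Normalising by pixel count is an assumption, so that images with
    # different amounts of blank space stay comparable.
    return hist / max(pixels.shape[0], 1)


def fake_image(green_share, rng):
    """Hypothetical 16x16 test image with a blank border and a brown/green mix."""
    img = np.zeros((16, 16, 3), dtype=np.uint8)
    is_green = rng.random((12, 12, 1)) < green_share
    colors = np.where(is_green, GREEN, BROWN) + rng.integers(-10, 11, size=(12, 12, 3))
    img[2:14, 2:14] = colors.astype(np.uint8)
    return img


rng = np.random.default_rng(0)
shares = [0.05, 0.5, 0.95]  # stand-ins for mostly brown, mixed, mostly green
features = np.array([extract_color_feature(fake_image(s, rng))
                     for s in shares for _ in range(20)])

# Smallest number of components whose cumulative explained variance reaches 95%.
cumulative = np.cumsum(PCA().fit(features).explained_variance_ratio_)
pca_component = int(np.searchsorted(cumulative, 0.95) + 1)
reduced = PCA(n_components=pca_component).fit_transform(features)

# Three clusters, matching the three target classes.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(reduced)
score = silhouette_score(reduced, labels)
print(pca_component, round(score, 3))
```

K-Means assigns cluster ids in no particular order, so turning them into No Crop, Growing, and Lush still requires inspecting a few images per cluster. A silhouette score closer to 1 indicates better separated clusters.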
