The Iris flower data set is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems". It is sometimes called Anderson's Iris data set because Edgar Anderson collected the data to quantify the morphologic variation of Iris flowers of three related species. The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters.
This dataset became a typical test case for many statistical classification techniques in machine learning, such as support vector machines.
The usual task with the Iris dataset is to classify each flower into one of three species (setosa, versicolor, or virginica) based on its sepal length, sepal width, petal length, and petal width. The dataset is often used as a classic machine learning example for classification and pattern recognition tasks.
The dataset contains 150 records with 5 attributes: Petal Length, Petal Width, Sepal Length, Sepal Width, and Class (Species).
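The record and class counts described above can be checked with a few lines of Python. This is only a minimal sketch: it assumes scikit-learn is installed and uses the copy of the data bundled with it (`load_iris`) rather than a CSV file, so the feature names it prints are scikit-learn's, not necessarily the column names in your own file.

```python
from collections import Counter

from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled copy of the Iris data instead of a CSV file.
iris = load_iris()
X, y = iris.data, iris.target  # X: four measurements per flower, y: species codes

n_records, n_features = X.shape

# Count how many records belong to each species name.
class_counts = Counter(str(iris.target_names[label]) for label in y)

print("records:", n_records)
print("measured features:", list(iris.feature_names))
print("records per species:", dict(class_counts))
```

The four measured features plus the species label together make up the five attributes listed above.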
For CLASSIFICATION
- CSV reader
- File reader
- Table view
- Scatter plot
- Partitioning
- Decision tree learner
- Decision tree predictor
- Naive Bayes learner
- Naive Bayes predictor
- SVM learner
- SVM predictor
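The classification steps listed above (read the data, partition it, then pair each learner with a predictor) can be sketched in Python as follows. This is an illustration under stated assumptions, not the workflow's actual configuration: it uses scikit-learn with its bundled Iris data, a 70/30 stratified split, a fixed `random_state`, and default model settings, none of which come from the original list.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Reader step (assumption: bundled data instead of a CSV file).
X, y = load_iris(return_X_y=True)

# Partitioning step (assumption: 70% training, 30% test, stratified by species).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# One model per learner/predictor pair in the list, with default settings.
models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "naive Bayes": GaussianNB(),
    "SVM": SVC(),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)          # learner step
    predictions = model.predict(X_test)  # predictor step
    accuracies[name] = accuracy_score(y_test, predictions)
    print(f"{name}: test accuracy = {accuracies[name]:.3f}")
```

The exact accuracies depend on the split, so it is worth rerunning with a different `random_state` before comparing the three models.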
For CLUSTERING
- CSV reader
- K-Means
- Table view
- Color manager
- Shape manager
- Scatter plot
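The clustering steps above can be approximated in Python with k-means. Again, this is a sketch under assumptions: it uses scikit-learn's `KMeans` on the bundled Iris data, with `n_clusters=3` chosen to match the three species and a fixed `random_state`. It prints a small cross-tabulation in place of the table view and scatter plot steps.

```python
from collections import Counter

from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Reader step (assumption: bundled data instead of a CSV file).
iris = load_iris()
X, y = iris.data, iris.target

# K-Means step: the species labels are NOT used for fitting.
# Assumption: 3 clusters, to match the three species.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# Table view substitute: which species ended up in each cluster?
cluster_contents = {}
for cluster in range(3):
    members = y[cluster_ids == cluster]
    cluster_contents[cluster] = Counter(
        str(iris.target_names[label]) for label in members
    )
    print(f"cluster {cluster}: {dict(cluster_contents[cluster])}")
```

Cluster numbers are arbitrary, so the comparison with species is only meaningful through a cross-tabulation like this one, not by matching cluster ids to class codes directly.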
The analysis of the Iris dataset shows that the four measurements (sepal length, sepal width, petal length, and petal width) can be used to accurately classify the three species of iris flowers. This dataset has been used extensively to demonstrate machine learning techniques for classification problems.
Azzam Jehtarhe