This repository sets up continuous integration for publishing my simple k-Nearest Neighbors (kNN) implementation as a PyPI package.
For the notebook version, please visit this repository.
k-Nearest Neighbors, kNN for short, is a very simple but powerful technique for making predictions. The principle behind kNN is to predict the outcome for new data from the most similar historical examples.
- Choose a value for k
- Compute the distance from the new point to each record in the training data
- Select the k nearest neighbors
- Make predictions
  - For a classification problem, the new data point is assigned to the class that most of its k nearest neighbors belong to.
  - For a regression problem, the prediction is the average or weighted average of the labels of the k nearest neighbors.
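
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code shipped in the `simple_kNN` package: the function names (`euclidean_distance`, `k_nearest`, `predict_class`, `predict_value`) and the choice of Euclidean distance are assumptions made for the example.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(train_X, train_y, query, k):
    """Return the labels of the k training points closest to `query`."""
    # Pair each training point's distance to the query with its label.
    distances = [
        (euclidean_distance(x, query), label)
        for x, label in zip(train_X, train_y)
    ]
    # Sort by distance only, so labels never need to be comparable.
    distances.sort(key=lambda pair: pair[0])
    return [label for _, label in distances[:k]]


def predict_class(train_X, train_y, query, k=3):
    """Classification: majority vote among the k nearest neighbors."""
    neighbors = k_nearest(train_X, train_y, query, k)
    return Counter(neighbors).most_common(1)[0][0]


def predict_value(train_X, train_y, query, k=3):
    """Regression: unweighted mean of the k nearest neighbors' labels."""
    neighbors = k_nearest(train_X, train_y, query, k)
    return sum(neighbors) / len(neighbors)


# Hypothetical toy data: two points near (1, 1) and two near (5, 5).
points = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.9]]
class_labels = ["a", "a", "b", "b"]
numeric_labels = [1.0, 2.0, 10.0, 11.0]

print(predict_class(points, class_labels, [1.1, 0.9], k=3))    # a
print(predict_value(points, numeric_labels, [1.1, 0.9], k=2))  # 1.5
```

A weighted average for regression would follow the same shape, with each neighbor's label weighted by, for example, the inverse of its distance.
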
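Finally, we evaluate the model using the k-Fold Cross Validation technique.
This technique involves randomly dividing the dataset into k approximately equal-sized groups, or folds. The first fold is kept for testing, and the model is trained on the remaining k-1 folds. The process is then repeated so that each fold serves as the test set exactly once, and the k scores are averaged. Note that the k in k-fold is the number of folds and is chosen independently of the k in kNN.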
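
As a rough sketch of the procedure just described, the snippet below splits sample indices into folds and scores a classifier on each fold in turn. It is not the package's `kFoldCV` implementation: the helper names (`k_fold_indices`, `cross_validate`), the `fit_predict` callback signature, and the use of accuracy as the score are all assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices and split them into n_folds folds."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list gives fold sizes that differ
    # by at most one.
    return [indices[i::n_folds] for i in range(n_folds)]


def cross_validate(X, y, n_folds, fit_predict, seed=0):
    """Return one accuracy score per fold.

    `fit_predict(train_X, train_y, query)` is a hypothetical callback
    that returns the predicted label for a single query point.
    """
    scores = []
    for test_idx in k_fold_indices(len(X), n_folds, seed):
        held_out = set(test_idx)
        # Train on every sample that is not in the current test fold.
        train_idx = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            fit_predict(train_X, train_y, X[i]) == y[i] for i in test_idx
        )
        scores.append(correct / len(test_idx))
    return scores
```

Averaging the returned list, for instance with `sum(scores) / len(scores)`, gives the overall cross-validated accuracy.
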
pip install simple-kNN
from simple_kNN.distanceMetrics import distanceMetrics
from simple_kNN.kFoldCV import kFoldCV
from simple_kNN.kNNClassifier import kNNClassifier
- My Medium article on building kNN from scratch
- More information on cross-validation can be found here
- kNN
- kFold Cross Validation
- Other variants of the kNN algorithm
- Recommendations using the kNN algorithm