|
2 | 2 | Python package for plug and play cross validation techniques. |
3 | 3 | If you like the idea or find this repo useful in your work, please leave a ⭐ to support this personal project. |
4 | 4 |
|
5 | | -The documentation will grow with all the information about all the cross-validation techniques. |
6 | | - |
7 | 5 | * Cross Validation methods: |
8 | | - * K-fold; |
9 | | - * Leave One Out (LOO); |
10 | | - * Leave One Subject Out (LOSO). |
| 6 | + * [K-fold](#k-fold); |
| 7 | + * [Leave One Out (LOO)](#leave-one-out-loo); |
| 8 | + * [Leave One Subject Out (LOSO)](#leave-one-subject-out-loso). |
| 9 | + |
| 10 | +At the moment the package is not available using `pip install <PACKAGE-NAME>`. |
| 11 | + |
| 12 | +For the installation from the source code click **[here](#installation)**. |
| 13 | + |
| 14 | +Each method returns the confusion matrix and some performance metrics for each iteration and for the overall result. |
| 15 | +The performance metrics are: |
| 16 | +* Balanced Accuracy; |
| 17 | +* F1 Score; |
| 18 | +* Matthews Correlation Coefficient. |
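For reference, the three metrics listed above can be derived directly from a confusion matrix. The sketch below covers the binary case only and is written by hand for illustration; `binary_metrics` is a hypothetical helper, not a function of this package, and the counts are made up. How the package itself computes these values (for example in the multi-class case) is not stated here.

```python
from math import sqrt


def binary_metrics(tp, fn, fp, tn):
    """Return (balanced accuracy, F1 score, MCC) from binary confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    balanced_accuracy = (sensitivity + specificity) / 2
    f1 = 2 * tp / (2 * tp + fp + fn)
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator is 0
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return balanced_accuracy, f1, mcc


# Illustrative counts, not results produced by the package
ba, f1, mcc = binary_metrics(tp=40, fn=10, fp=5, tn=45)
print(f"balanced accuracy={ba:.3f}  F1={f1:.3f}  MCC={mcc:.3f}")
```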
11 | 19 |
|
12 | 20 | ## K-fold |
| 21 | +K-fold partitions the dataset into k subsets; in each iteration, one of the k subsets is the test set and the others form the training set. |
| 22 | +The value of k can be chosen according to the amount of available data. Increasing k enlarges each training set and shrinks each test set. |
| 23 | +Typically, k is set between 5 and 10, which is a good trade-off between a robust validation and computation time. |
| 24 | +After a k-fold cross-validation, every sample in the dataset has been tested exactly once, so it is possible to generate a confusion matrix and compute some performance metrics to validate the generalization capabilities of your model. |
| 25 | + |
| 26 | + |
| 27 | +***K-fold cross-validation concept illustration** Each row represents an iteration of the cross-validation; the subsets used as the training set are shown in blue and the subset used as the test set for the i-th iteration is shown in orange. |
| 28 | +At the end, each subset has been tested, yielding predictions that can be compared with the true outputs of the instances* |
| 29 | + |
| 30 | +### Example |
| 31 | +```python |
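| 32 | +from cross_validation.cross_validation import kfold |
| | +from sklearn.ensemble import RandomForestClassifier |
| 33 | + |
| 34 | +clf = RandomForestClassifier() |
| 35 | +[cm, perf] = kfold(clf, X, y, verbose=True)  # X: feature matrix, y: labels (defined elsewhere) |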
| 36 | +``` |
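To make the partitioning concrete, here is a minimal pure-Python sketch of how k-fold train/test indices can be generated. `kfold_indices` is a hypothetical helper introduced for illustration and is not part of this package; it assumes contiguous folds without shuffling, which may differ from what `kfold` does internally.

```python
def kfold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # The first n_samples % k folds get one extra sample
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(kfold_indices(n_samples=10, k=5))
for i, (train, test) in enumerate(folds):
    print(f"iteration {i}: train={train} test={test}")
```

With 10 samples and k = 5, each test fold holds 2 samples and each training set holds the remaining 8; raising k shrinks the test folds and enlarges the training sets.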
13 | 37 |
|
14 | 38 | ## Leave One Out (LOO) |
| 39 | +Leave-one-out (LOO) is a special case of k-fold in which k equals the number of data points in the dataset. |
| 40 | +This method should be used when the dataset has few samples: each iteration trains on all data points but one, so the model gets as much training data as possible, and after the training phase only the single held-out point is evaluated by the model. |
| 41 | + |
| 42 | +### Example |
| 43 | +```python |
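| 44 | +from cross_validation.cross_validation import leave_one_out |
| | +from sklearn.ensemble import RandomForestClassifier |
| 45 | + |
| 46 | +clf = RandomForestClassifier() |
| 47 | +[cm, perf] = leave_one_out(clf, X, y, verbose=True)  # X: feature matrix, y: labels (defined elsewhere) |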
| 48 | +``` |
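As a rough illustration of the splitting scheme, the sketch below enumerates leave-one-out splits for a tiny dataset. `leave_one_out_indices` is a hypothetical helper written for this example, not the package's implementation.

```python
def leave_one_out_indices(n_samples):
    """Yield (train_indices, test_indices), holding out one sample per iteration."""
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, [held_out]


splits = list(leave_one_out_indices(4))
for train, test in splits:
    print(f"train={train} test={test}")
```

The number of iterations equals the number of samples, so the model is trained once per data point; this is why the method is best suited to small datasets.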
15 | 49 |
|
16 | 50 | ## Leave One Subject Out (LOSO) |
| 51 | +This method can be considered a variant of leave-one-out cross-validation. Instead of leaving out a single example as the test set, it leaves out all the examples that belong to a specific subject. The other subjects’ instances are used to train the learning algorithm. |
| 52 | +The main advantage of LOSO is the removal of subject bias, because all the instances of the held-out subject are in the test set and none of them are seen during training. |
| 53 | +This cross-validation technique is widely used in the biomedical field, where the main task is to predict a disease or a condition of a patient using data from other patients. |
| 54 | + |
| 55 | +### Example |
| 56 | +```python |
| 57 | +from cross_validation.cross_validation import leave_one_subject_out |
| | +from sklearn.ensemble import RandomForestClassifier |
| 58 | + |
| 59 | +clf = RandomForestClassifier() |
| 60 | +[cm, perf] = leave_one_subject_out(clf, X, y, subject_ids, verbose=True)  # subject_ids: one subject label per row of X |
| 61 | +``` |
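The grouping by subject can be sketched in a few lines of plain Python. `loso_splits` and the subject labels below are hypothetical and only illustrate the idea; the sketch assumes `subject_ids` holds one label per sample, in the same order as the rows of `X`.

```python
def loso_splits(subject_ids):
    """Yield (subject, train_indices, test_indices), holding out one subject at a time."""
    subjects = list(dict.fromkeys(subject_ids))  # unique subjects, first-seen order
    for subject in subjects:
        test = [i for i, s in enumerate(subject_ids) if s == subject]
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train, test


# Made-up labels: three subjects with 2, 1 and 3 samples
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
splits = list(loso_splits(subject_ids))
for subject, train, test in splits:
    print(f"held-out subject={subject}: train={train} test={test}")
```

Because every sample of the held-out subject lands in the test set, no training set ever contains data from the subject being evaluated.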
| 62 | + |
| 63 | +## Installation |
| 64 | +To install from the source code, run one of these commands in your terminal window: |
| 65 | +``` |
| 66 | +pip install git+<repository-link> |
| 67 | +``` |
| 68 | +or |
| 69 | +``` |
| 70 | +python -m pip install git+<repository-link> |
| 71 | +``` |
| 72 | +or |
| 73 | +``` |
| 74 | +python3 -m pip install git+<repository-link> |
| 75 | +``` |