MlBayesOpt/README.Rmd at master · ymattu/MlBayesOpt · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# MlBayesOpt <img src="man/figures/logo.png" align="right" />

[![CRAN\_Status\_Badge](http://www.r-pkg.org/badges/version/MlBayesOpt)](https://cran.r-project.org/package=MlBayesOpt)
[![Build Status](https://travis-ci.org/ymattu/MlBayesOpt.svg?branch=master)](https://travis-ci.org/ymattu/MlBayesOpt)
[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ymattu/MlBayesOpt?branch=master&svg=true)](https://ci.appveyor.com/project/ymattu/MlBayesOpt)
[![Coverage Status](https://img.shields.io/codecov/c/github/ymattu/MlBayesOpt/master.svg)](https://codecov.io/github/ymattu/MlBayesOpt?branch=master)

## Overview
This is an R package to tune hyperparameters for machine learning algorithms
using Bayesian Optimization based on Gaussian Processes. Algorithms currently supported are: Support Vector Machines, Random Forest, and XGboost.

## Why MlBayesOpt ?
### Easy to write
It's very easy to write Bayesian Optimaization function, but you also able to customise your model very easily. You have only to specify the data and the column name of the label to classify.

On XgBosst functions, your data frame is automatically transformed into `xgb.DMatrix` class.

### Any label class is OK
Any class (character, integer, factor) of label column is OK. The class of the label column is automatically transformed.


## Installation
```{r eval=FALSE}
install.packages("MlBayesOpt")
```

You can also install MlBayesOpt from github with:

```{r gh-installation, eval = FALSE}
# install.packages("githubinstall")
githubinstall::githubinstall("MlBayesOpt")

# install.packages("devtools")
devtools::install_github("ymattu/MlBayesOpt")
```

## Data
### Small Fashion MNIST
`fashion_train` and `fashion_test` are data reproduced from [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist). Each data has 1,000 rows and 784 feature column, and 1 label column named `y`.

`fashion` is a data made by the function `dplyr::bind_rows(fashion_train, fashion_test)`.

### iris
`iris_train` and `iris_test` are included in this pacakge. `iris_train` is odd-numbered rows of `iris` data, and `iris_test`is even-numbered rows of `iris` data.

## Example
### 3-fold cross validation for `iris` data, using SVM.
```{r load, echo=FALSE, message=FALSE}
devtools::load_all()
```

```{r example}
library(MlBayesOpt)

set.seed(71)
res0 <- svm_cv_opt(data = iris,
                   label = Species,
                   n_folds = 3,
                   init_points = 10,
                   n_iter = 1)
```


### 3-fold cross validation for `iris` data, using Xgboost.
```{r}
res0 <- xgb_cv_opt(data = iris,
                   label = Species,
                   objectfun = "multi:softmax",
                   evalmetric = "mlogloss",
                   n_folds = 3,
                   classes = 3,
                   init_points = 10,
                   n_iter = 1)

```


## For Details
See the [vignette](https://ymattu.github.io/MlBayesOpt/articles/MlBayesOpt.html)

## ToDo
- [x] Make functions to execute cross validation
- [ ] Fix minor bugs