Commit 214275e

Fixed memory leak, fixed github actions, updated README (#17)
* Create cran.yml (#15)
* Upped version number
* Ignore GitHub Actions folder
* Updated manuals using Roxygen2
* Added virtual destructor to prevent memory leak
* Cleaned compiler flags
* Adjusted testthat catch entrypoint to new definition
* Added DS_Store to gitignore
* Added DS_Store to gitignore
* Create codecov.yml
* Added covr, not running all tests yet
* LTO warnings - deleted extra ;
* Ignore cpp test files in test coverage
* Fixed incomplete final line
* Running tests on all platforms, not excluding macOS anymore since the testing issue (invalid pointer operation) has been fixed
* Running codecov on ubuntu-20.01
* Run on push / PR on dev and master branches
* Changed back to running on macOS
* Updated README and included usage examples
* Exclude README assets from R build
* Updated autoencoder usage example
1 parent 31b5fa2 commit 214275e

File tree

9 files changed: +160 −9 lines

.Rbuildignore

Lines changed: 2 additions & 0 deletions

```diff
@@ -3,3 +3,5 @@
 ^\.Rproj\.user$
 .github
 .covrignore
+man/images
+man/README_example.R
```

.covrignore

Lines changed: 1 addition & 0 deletions

```diff
@@ -1 +1,2 @@
 inst
+src/test*
```

.github/workflows/cran.yml

Lines changed: 2 additions & 2 deletions

```diff
@@ -3,11 +3,11 @@
 on:
   push:
     branches:
-      - main
+      - dev
       - master
   pull_request:
     branches:
-      - main
+      - dev
       - master
 
 name: R-CMD-check
```

README.md

Lines changed: 109 additions & 2 deletions

Removed: the `# ANN2` heading (moved to the top of the file) and the old package description:

> Training of neural networks for classification and regression tasks using mini-batch gradient descent. Special features include a function for training autoencoders, which can be used to detect anomalies, and some related plotting functions. Multiple activation functions are supported, including tanh, relu, step and ramp. For the use of the step and ramp activation functions in detecting anomalies using autoencoders, see Hawkins et al. (2002). Furthermore, several loss functions are supported, including robust ones such as Huber and pseudo-Huber loss, as well as L1 and L2 regularization. The possible options for optimization algorithms are RMSprop, Adam and SGD with momentum. The package contains a vectorized C++ implementation that facilitates fast training through mini-batch learning.

Resulting README:

# ANN2

[![Licence](https://img.shields.io/badge/licence-GPL--3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0.en.html)
[![CRAN\_Status\_Badge](http://www.r-pkg.org/badges/version/ANN2)](https://cran.r-project.org/package=ANN2)
![Monthly downloads](https://cranlogs.r-pkg.org/badges/ANN2)
![R CMD check](https://github.com/bflammers/ANN2/workflows/R-CMD-check/badge.svg)
[![codecov](https://codecov.io/gh/bflammers/ANN2/branch/dev/graph/badge.svg)](https://codecov.io/gh/bflammers/ANN2)

Artificial Neural Networks package for R

This package lets you train neural networks for classification and regression tasks, as well as autoencoders for anomaly detection. Several helper and plotting functions are included to improve usability and your understanding of what the model does. ANN2 contains a vectorized neural net implementation in C++ that facilitates fast training through mini-batch gradient descent.

ANN2 offers the following features:
* Easy-to-use interface: define and train neural nets with a single function call
* Activation functions: tanh, sigmoid, relu, linear, ramp, step
* Loss functions: log, squared, absolute, huber, pseudo-huber
* Regularization: L1, L2
* Optimizers: sgd, sgd with momentum, RMSprop, ADAM
* Plotting functions for visualizing encodings, reconstructions and loss (training and validation)
* Helper functions for predicting, reconstructing, encoding and decoding
* Reading and writing the trained model from / to disk
* Access to model parameters and low-level Rcpp module methods

## Usage

Defining and training a multilayer neural network with ANN2 is done with a single call to either:
* `neuralnetwork()` - a multilayer neural net for classification or regression,
* `autoencoder()` - an autoencoding neural network trained to reconstruct its inputs.

Below are two examples with minimal code snippets that show how to use these functions.

### `neuralnetwork()`

We'll train a neural network with dimensions 4 x 5 x 5 x 3 on the iris data set, classifying each observation (sepal length, sepal width, petal length and petal width measurements for three species of flowers) as setosa, versicolor or virginica. The dimensions of the input and output layers are inferred from the data; the hidden layer dimensions are set through the `hidden.layers` argument, a single vector specifying the number of nodes in each hidden layer.

``` r
library(ANN2)

# Prepare test and train sets
random_idx <- sample(1:nrow(iris), size = 145)
X_train <- iris[random_idx, 1:4]
y_train <- iris[random_idx, 5]
X_test <- iris[setdiff(1:nrow(iris), random_idx), 1:4]
y_test <- iris[setdiff(1:nrow(iris), random_idx), 5]

# Train neural network on classification task
NN <- neuralnetwork(X = X_train,
                    y = y_train,
                    hidden.layers = c(5, 5),
                    optim.type = 'adam',
                    n.epochs = 5000)

# Predict the class for new data points
predict(NN, X_test)

# $predictions
# [1] "setosa" "setosa" "setosa" "versicolor" "versicolor"
#
# $probabilities
#      class_setosa class_versicolor class_virginica
# [1,] 0.9998184126     0.0001814204    1.670401e-07
# [2,] 0.9998311154     0.0001687264    1.582390e-07
# [3,] 0.9998280223     0.0001718171    1.605735e-07
# [4,] 0.0001074780     0.9997372337    1.552883e-04
# [5,] 0.0001077757     0.9996626441    2.295802e-04

# Plot the training and validation loss
plot(NN)
```
![](man/images/nn_loss.png)

You can interact with the resulting `ANN` object using the methods `plot()`, `print()` and `predict()`. Storing the model to disk and loading it back can be done with `write_ANN()` and `read_ANN()`, respectively. Other, more low-level, methods of the C++ module can be accessed through the `$` operator as members of the object, e.g. `NN$Rcpp_ANN$getParams()` to get the parameters (weight matrices and bias vectors) of the trained model.

### `autoencoder()`

The `autoencoder()` function trains an autoencoding neural network. In the next example we'll train an autoencoder of dimension 4 x 10 x 3 x 10 x 4 on the USArrests data set. The middle hidden layer acts as a bottleneck that forces the autoencoder to retain only structural variation and discard random variation. By flagging data points that are poorly reconstructed (high reconstruction error) as aberrant, we exploit this denoising property for anomaly detection.

``` r
# Prepare test and train sets
random_idx <- sample(1:nrow(USArrests), size = 45)
X_train <- USArrests[random_idx,]
X_test <- USArrests[setdiff(1:nrow(USArrests), random_idx),]

# Define and train autoencoder
AE <- autoencoder(X = X_train,
                  hidden.layers = c(10, 3, 10),
                  loss.type = 'pseudo-huber',
                  optim.type = 'adam',
                  n.epochs = 5000)

# Reconstruct test data
reconstruct(AE, X_test)

# $reconstructed
#         Murder   Assault UrbanPop      Rape
# [1,]  8.547431 243.85898 75.60763 37.791746
# [2,] 12.972505 268.68226 65.40411 29.475545
# [3,]  2.107441  78.55883 67.75074 15.040075
# [4,]  2.085750  56.76030 55.32376  9.346483
# [5,] 12.936357 252.09209 56.24075 24.964715
#
# $anomaly_scores
# [1]  398.926431  247.238111   11.613522    0.134633 1029.806121

# Plot original points (grey) and reconstructions (colored) for training data
reconstruction_plot(AE, X_train)
```

![](man/images/ae_reconstruction_plot.png)

In the reconstruction plot we see the original points (grey) along with their reconstructions (colored on a scale based on reconstruction error), connected to each other by grey lines. The length of each line denotes the reconstruction error.

You can interact with the `ANN` object returned by `autoencoder()` using various methods, including `plot()`, `encode()`, `decode()` and `reconstruct()`.

More details on the supported arguments to `neuralnetwork()` and `autoencoder()`, as well as examples and explanations of the helper and plotting functions, can be found in the manual.

***
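Both README examples build their train/test split with plain base R: `sample()` draws the training indices, `setdiff()` yields the held-out rows. A minimal, self-contained sketch of just that split (the `set.seed()` call is added here for reproducibility; the README snippets don't set one), with checks that it really partitions the data:

``` r
# Train/test split as used in the examples: 145 training rows, 5 test rows
set.seed(42)
random_idx <- sample(1:nrow(iris), size = 145)
test_idx   <- setdiff(1:nrow(iris), random_idx)

X_train <- iris[random_idx, 1:4]
X_test  <- iris[test_idx, 1:4]

# The index sets are disjoint and together cover all 150 rows of iris
stopifnot(length(intersect(random_idx, test_idx)) == 0)
stopifnot(length(random_idx) + length(test_idx) == nrow(iris))
```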

inst/cereal/types/memory.hpp

Lines changed: 1 addition & 1 deletion

```diff
@@ -422,4 +422,4 @@ namespace cereal
 #include "cereal/types/polymorphic.hpp"
 
 #undef CEREAL_ALIGNOF
-#endif // CEREAL_TYPES_SHARED_PTR_HPP_
+#endif // CEREAL_TYPES_SHARED_PTR_HPP_
```

The changed line differs only in its previously missing terminating newline (the "Fixed incomplete final line" item in the commit message).
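R CMD check flags files whose last line lacks a terminating newline with an "incomplete final line" warning, which is what this one-byte change fixes. A base-R sketch of how such a missing final newline can be detected; the helper name `has_final_newline` is made up for illustration and is not part of ANN2:

``` r
# Hypothetical helper (not from ANN2): does the file end with a newline byte?
has_final_newline <- function(path) {
  sz <- file.info(path)$size
  if (is.na(sz) || sz == 0) return(TRUE)   # treat empty files as fine
  con <- file(path, "rb")
  on.exit(close(con))
  seek(con, sz - 1)                        # jump to the last byte
  identical(readBin(con, "raw", n = 1L), charToRaw("\n"))
}

tmp <- tempfile()
writeLines("#endif // CEREAL_TYPES_SHARED_PTR_HPP_", tmp)  # writeLines appends "\n"
has_final_newline(tmp)   # TRUE

cat("#endif // CEREAL_TYPES_SHARED_PTR_HPP_", file = tmp)  # cat() does not
has_final_newline(tmp)   # FALSE
```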

man/README_example.R

Lines changed: 44 additions & 0 deletions (new file)

``` r
library(ANN2)

#### NEURALNETWORK

# Prepare test and train sets
random_idx <- sample(1:nrow(iris), size = 145)
X_train <- iris[random_idx, 1:4]
y_train <- iris[random_idx, 5]
X_test <- iris[setdiff(1:nrow(iris), random_idx), 1:4]
y_test <- iris[setdiff(1:nrow(iris), random_idx), 5]

# Train neural network on classification task
NN <- neuralnetwork(X = X_train,
                    y = y_train,
                    hidden.layers = c(5, 5),
                    optim.type = 'adam',
                    n.epochs = 5000)

# Predict the class for new data points
predict(NN, X_test)

# Plot the training and validation loss
plot(NN)

#### AUTOENCODER

# Prepare test and train sets
random_idx <- sample(1:nrow(USArrests), size = 45)
X_train <- USArrests[random_idx,]
X_test <- USArrests[setdiff(1:nrow(USArrests), random_idx),]

# Define and train autoencoder
AE <- autoencoder(X = X_train,
                  hidden.layers = c(10, 3, 10),
                  loss.type = 'pseudo-huber',
                  optim.type = 'adam',
                  n.epochs = 5000)

# Plot original points (grey) and reconstructions (colored)
reconstruction_plot(AE, X_train)

# Reconstruct test data
reconstruct(AE, X_test)
```
44+
(binary image file, 135 KB)

man/images/nn_loss.png

(binary image file, 42.6 KB)

tests/testthat.R

Lines changed: 1 addition & 4 deletions

```diff
@@ -1,7 +1,4 @@
 library(testthat)
 library(ANN2)
 
-# Only test if not on mac
-if (tolower(Sys.info()[["sysname"]]) != "darwin") {
-  test_check("ANN2")
-}
+test_check("ANN2")
```
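The removed guard skipped the test suite on macOS; per the commit message, it is no longer needed because the invalid pointer operation on that platform has been fixed. For reference, the platform check it relied on is plain base R:

``` r
# Sys.info() reports the kernel name; on macOS it is "Darwin"
sysname <- tolower(Sys.info()[["sysname"]])  # e.g. "linux", "darwin", "windows"
on_mac  <- sysname == "darwin"
```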

0 commit comments
