Consider a dataset with $p$ numerical attributes.
The goal of the autoencoder is to reduce the dimensionality from $p$ to $k$ ($k < p$), such that these $k$ attributes are enough to reconstruct the original $p$ attributes. The decoder then maps the $k$-dimensional representation back to $p$ dimensions. The higher the quality of the autoencoder, the more similar its output is to its input.
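Formally, the autoencoder learns an encoder $f: \mathbb{R}^p \rightarrow \mathbb{R}^k$ and a decoder $g: \mathbb{R}^k \rightarrow \mathbb{R}^p$ by minimizing the reconstruction error over the training examples $x_i$:

$$\min_{f,g} \sum_i \lVert x_i - g(f(x_i)) \rVert^2$$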
# DAL ToolBox
# version 1.0.777
source("https://raw.githubusercontent.com/cefet-rj-dal/daltoolbox/main/jupyter.R")
# loading DAL
load_library("daltoolbox")
Loading required package: daltoolbox

Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo

Attaching package: ‘daltoolbox’

The following object is masked from ‘package:base’:

    transform
data(sin_data)                      # sine wave example dataset from daltoolbox
sw_size <- 5                        # sliding window size
ts <- ts_data(sin_data$y, sw_size)  # build sliding-window dataset from the series
ts_head(ts)
t4 | t3 | t2 | t1 | t0 |
---|---|---|---|---|
0.0000000 | 0.2474040 | 0.4794255 | 0.6816388 | 0.8414710 |
0.2474040 | 0.4794255 | 0.6816388 | 0.8414710 | 0.9489846 |
0.4794255 | 0.6816388 | 0.8414710 | 0.9489846 | 0.9974950 |
0.6816388 | 0.8414710 | 0.9489846 | 0.9974950 | 0.9839859 |
0.8414710 | 0.9489846 | 0.9974950 | 0.9839859 | 0.9092974 |
0.9489846 | 0.9974950 | 0.9839859 | 0.9092974 | 0.7780732 |
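Each row above is an overlapping window of five consecutive observations. Purely as an illustration (this is not the internals of `ts_data`), the same windowing can be sketched in base R, assuming `sin_data$y` is the raw numeric series:

# illustrative base-R sketch of a size-5 sliding window (not the ts_data internals)
y <- sin_data$y
sw <- t(sapply(seq_len(length(y) - 4), function(i) y[i:(i + 4)]))
colnames(sw) <- c("t4", "t3", "t2", "t1", "t0")
head(sw)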
preproc <- ts_norm_gminmax()  # global min-max normalization
preproc <- fit(preproc, ts)   # learn the global min and max
ts <- transform(preproc, ts)  # rescale all windows
ts_head(ts)
t4 | t3 | t2 | t1 | t0 |
---|---|---|---|---|
0.5004502 | 0.6243512 | 0.7405486 | 0.8418178 | 0.9218625 |
0.6243512 | 0.7405486 | 0.8418178 | 0.9218625 | 0.9757058 |
0.7405486 | 0.8418178 | 0.9218625 | 0.9757058 | 1.0000000 |
0.8418178 | 0.9218625 | 0.9757058 | 1.0000000 | 0.9932346 |
0.9218625 | 0.9757058 | 1.0000000 | 0.9932346 | 0.9558303 |
0.9757058 | 1.0000000 | 0.9932346 | 0.9558303 | 0.8901126 |
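All values now lie in $[0, 1]$. Conceptually, a global min-max scaling works as sketched below; the exact adjustment applied by `ts_norm_gminmax` may differ in detail, and `x` here is a hypothetical numeric matrix:

# conceptual sketch of global min-max scaling (details may differ from ts_norm_gminmax)
gminmax_sketch <- function(x) {
  gmin <- min(x)
  gmax <- max(x)
  (x - gmin) / (gmax - gmin)  # rescales every entry into [0, 1]
}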
samp <- ts_sample(ts, test_size = 10)  # split into training and test sets (10 test windows)
train <- as.data.frame(samp$train)
test <- as.data.frame(samp$test)
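Because the data is a time series, the split should preserve temporal order. A conceptual sketch, assuming `ts_sample` reserves the final `test_size` windows for testing:

# conceptual order-preserving split (assuming the last 10 windows become the test set)
n <- nrow(ts)
train_sketch <- ts[seq_len(n - 10), ]
test_sketch <- ts[(n - 9):n, ]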
Create an autoencoder that reduces the windows from 5 to 3 dimensions, then train it on the training set.
auto <- autoenc_encode_decode(5, 3)  # input size 5, encoding (latent) size 3
auto <- fit(auto, train)
Present the original test set alongside its encode-decode reconstruction.
print(head(test))                # original test windows
result <- transform(auto, test)  # encode and decode the test windows
print(head(result))              # reconstructed windows
         t4        t3        t2        t1        t0
1 0.7258342 0.8294719 0.9126527 0.9702046 0.9985496
2 0.8294719 0.9126527 0.9702046 0.9985496 0.9959251
3 0.9126527 0.9702046 0.9985496 0.9959251 0.9624944
4 0.9702046 0.9985496 0.9959251 0.9624944 0.9003360
5 0.9985496 0.9959251 0.9624944 0.9003360 0.8133146
6 0.9959251 0.9624944 0.9003360 0.8133146 0.7068409

          [,1]      [,2]      [,3]      [,4]      [,5]
[1,] 0.7281464 0.8297843 0.9119417 0.9712389 0.9987636
[2,] 0.8298926 0.9120755 0.9692511 0.9992059 0.9964882
[3,] 0.9135329 0.9704542 0.9987994 0.9969148 0.9618803
[4,] 0.9685430 0.9973767 0.9959966 0.9621131 0.9001518
[5,] 0.9985738 0.9968489 0.9646484 0.8984091 0.8126789
[6,] 0.9928818 0.9628512 0.9030885 0.8135285 0.7084026
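The reconstruction closely tracks the original windows. As a simple quantitative check (an addition to the original example), the mean squared reconstruction error can be computed with base R:

# mean squared error between the test windows and their reconstruction
mse <- mean((as.matrix(test) - result)^2)
print(mse)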