IMDB Sentiment Analysis (Warm-Start From Pretrained Embedding)

Biostat 203B

Author

Dr. Hua Zhou @ UCLA

Published

February 28, 2024

1 Setup

Display system information for reproducibility.

import IPython
print(IPython.sys_info())
{'commit_hash': '8b1204b6c',
 'commit_source': 'installation',
 'default_encoding': 'utf-8',
 'ipython_path': '/opt/venv/lib/python3.10/site-packages/IPython',
 'ipython_version': '8.21.0',
 'os_name': 'posix',
 'platform': 'Linux-6.6.12-linuxkit-aarch64-with-glibc2.35',
 'sys_executable': '/opt/venv/bin/python',
 'sys_platform': 'linux',
 'sys_version': '3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]'}
sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-unknown-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/aarch64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/aarch64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] digest_0.6.34     fastmap_1.1.1     xfun_0.42         Matrix_1.6-1.1   
 [5] lattice_0.21-9    reticulate_1.35.0 knitr_1.45        htmltools_0.5.7  
 [9] png_0.1-8         rmarkdown_2.25    cli_3.6.2         grid_4.3.2       
[13] compiler_4.3.2    rstudioapi_0.15.0 tools_4.3.2       evaluate_0.23    
[17] Rcpp_1.0.12       yaml_2.3.8        rlang_1.1.3       jsonlite_1.8.8   
[21] htmlwidgets_1.6.4

Load libraries.

# Plotting tool
import matplotlib.pyplot as plt
# Load Tensorflow and Keras
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow_hub as hub
library(keras)
library(tfhub)

Source: https://tensorflow.rstudio.com/tutorials/keras/text_classification_with_hub

2 Prepare data

Unlike the earlier experiment, where we fit an LSTM to the preprocessed IMDB data, here we start from the original raw text of the IMDB reviews.

We download the IMDB dataset from a static URL (if it is not already in the cache):

if (dir.exists("aclImdb/"))
  unlink("aclImdb/", recursive = TRUE)
url <- "https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"
dataset <- get_file(
  "aclImdb_v1",
  url,
  untar = TRUE,
  cache_dir = '.',
  cache_subdir = ''
)
Downloading data from https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz

84125825/84125825 [==============================] - 9s 0us/step
unlink("aclImdb/train/unsup/", recursive = TRUE)

We can then create TensorFlow datasets from the directory structure using the text_dataset_from_directory function, holding out 20% of the 25,000 training reviews for validation:

batch_size = 512
seed = 425

train_data = keras.utils.text_dataset_from_directory(
  'aclImdb/train',
  batch_size = batch_size,
  validation_split = 0.2,
  subset = 'training',
  seed = seed
)
Found 25000 files belonging to 2 classes.
Using 20000 files for training.
validation_data = keras.utils.text_dataset_from_directory(
  'aclImdb/train',
  batch_size = batch_size,
  validation_split = 0.2,
  subset = 'validation',
  seed = seed
)
Found 25000 files belonging to 2 classes.
Using 5000 files for validation.
test_data = keras.utils.text_dataset_from_directory(
  'aclImdb/test',
  batch_size = batch_size
)
Found 25000 files belonging to 2 classes.
batch_size <- 512
seed <- 425

train_data <- text_dataset_from_directory(
  'aclImdb/train',
  batch_size = batch_size,
  validation_split = 0.2,
  subset = 'training',
  seed = seed
)
Found 25000 files belonging to 2 classes.
Using 20000 files for training.
validation_data <- text_dataset_from_directory(
  'aclImdb/train',
  batch_size = batch_size,
  validation_split = 0.2,
  subset = 'validation',
  seed = seed
)
Found 25000 files belonging to 2 classes.
Using 5000 files for validation.
test_data <- text_dataset_from_directory(
  'aclImdb/test',
  batch_size = batch_size
)
Found 25000 files belonging to 2 classes.

Let’s take a moment to understand the format of the data. Each example is the full text of a movie review together with a corresponding label. The text is not preprocessed in any way. The label is an integer value of either 0 or 1, where 0 is a negative review and 1 is a positive review.
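
The integer labels come from the sub-directory names under aclImdb/train, which are exposed on the dataset object; a minimal check using the Python dataset created above:

# The class names are the sub-directory names in alphabetical order,
# so label 0 corresponds to 'neg' and label 1 to 'pos'
print(train_data.class_names)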

Let’s print the first example.

# Materialize the dataset and take the first (reviews, labels) batch
batch = list(train_data.as_numpy_iterator())[0]
batch[0][0]
b"Let me start out by saying I'm a big Carrey fan. Although I'll admit I haven't seen all of his movies *cough*the magestic*cough*. Bruce Almighty was enjoyable. None of the other reviews have really gone into how cheesy it gets towards the end, I dont know what the writers were thinking. Somehow I couldn't help but feel like this movie was a poor attempt at re-creating Liar Liar.<br /><br />On a positive note, The Daily Show's Steve Correl is HILARIOUS and so is the rest of the cast. See Bruce Almighty if you're a big Jim Carrey fan, or if you just want to see a light-hearted (que soft piano music) somewhat funny comedy."

The first few labels:

batch[1][0:9]
array([1, 1, 1, 0, 0, 1, 0, 0, 0], dtype=int32)
# Pull the first (reviews, labels) batch from the training dataset
batch <- train_data %>%
  reticulate::as_iterator() %>%
  reticulate::iter_next()

batch[[1]][1]
tf.Tensor(b"Let me start out by saying I'm a big Carrey fan. Although I'll admit I haven't seen all of his movies *cough*the magestic*cough*. Bruce Almighty was enjoyable. None of the other reviews have really gone into how cheesy it gets towards the end, I dont know what the writers were thinking. Somehow I couldn't help but feel like this movie was a poor attempt at re-creating Liar Liar.<br /><br />On a positive note, The Daily Show's Steve Correl is HILARIOUS and so is the rest of the cast. See Bruce Almighty if you're a big Jim Carrey fan, or if you just want to see a light-hearted (que soft piano music) somewhat funny comedy.", shape=(), dtype=string)

Let’s also print the first 10 labels.

batch[[2]][1:10]
tf.Tensor([1 1 1 0 0 1 0 0 0 0], shape=(10), dtype=int32)

3 Build model

Let’s first create a Keras layer that uses a TensorFlow Hub model (nnlm-en-dim50) to embed the review text, and try it out on a couple of input examples. Note that no matter the length of the input text, the output shape of the embeddings is (num_examples, embedding_dimension).

embedding = "https://tfhub.dev/google/nnlm-en-dim50/2"
hub_layer = hub.KerasLayer(
  handle = embedding, 
  # Enable fine-tuning (takes longer)
  trainable = True
  )
# Embed the first training texts
hub_layer(batch[0][0:1])
<tf.Tensor: shape=(1, 50), dtype=float32, numpy=
array([[ 0.76370066,  0.183997  , -0.12718768,  0.70335877,  0.07327911,
        -0.03899736,  0.10226654, -0.30235237, -0.50765055,  0.62317777,
         0.22796501, -0.09297381, -0.1462293 ,  0.20359744, -0.50289273,
         0.12905747, -0.46563283,  0.46837053,  0.3183409 , -0.53685   ,
         0.02151133, -0.40384126,  0.18346405,  0.21639028, -0.3739372 ,
         0.17969884, -1.0825881 , -0.08053909,  0.5606583 , -0.32753116,
        -0.7381755 ,  0.07553624,  0.28268006, -0.14106293, -0.40518084,
         0.27209735,  0.4923586 , -0.09804886,  0.2137844 , -0.45998612,
         0.289826  ,  0.12571187, -0.26875192,  0.02086616, -0.43353546,
        -0.11499331, -0.6014056 , -0.2741146 ,  0.04519763, -0.06563535]],
      dtype=float32)>
embedding <- "https://tfhub.dev/google/nnlm-en-dim50/2"
hub_layer <- tfhub::layer_hub(
  handle = embedding, 
  # Enable fine-tuning (takes longer)
  trainable = TRUE
  )
# Embed the first training texts
hub_layer(batch[[1]][1:2])
tf.Tensor(
[[ 0.76370066  0.183997   -0.12718768  0.70335877  0.07327911 -0.03899736
   0.10226654 -0.30235237 -0.50765055  0.62317777  0.22796501 -0.09297381
  -0.1462293   0.20359744 -0.50289273  0.12905747 -0.46563283  0.46837053
   0.3183409  -0.53685     0.02151133 -0.40384126  0.18346405  0.21639028
  -0.3739372   0.17969884 -1.0825881  -0.08053909  0.5606583  -0.32753116
  -0.7381755   0.07553624  0.28268006 -0.14106293 -0.40518084  0.27209735
   0.4923586  -0.09804886  0.2137844  -0.45998612  0.289826    0.12571187
  -0.26875192  0.02086616 -0.43353546 -0.11499331 -0.6014056  -0.2741146
   0.04519763 -0.06563535]
 [ 0.9949969   0.5141704   0.16449574  0.8084587  -0.20159443 -0.50888604
   0.38217273 -0.12473521 -1.0824962   0.7963115  -0.5053254   0.13061531
   0.02384013 -0.05730984 -0.53146046 -0.40029854 -0.78000605 -0.22182897
   0.48937032 -1.3286033   0.06303684 -0.16473001  1.3556808  -0.23764718
  -0.49831617  0.63716036 -1.8083394   0.09699536  0.24990597 -1.0124916
  -0.5400294   0.5142796   1.0795236  -0.64328635 -0.76760125  0.46185046
   0.34145948 -0.41720492  0.38070628 -1.0945162  -0.12662072  0.37232855
  -0.5240889   0.7304352  -0.21560599 -0.26014465 -0.54552066 -1.1023974
   0.16730602 -0.00775222]], shape=(2, 50), dtype=float32)
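
As a quick sanity check (a sketch reusing the Python batch above), the layer maps each variable-length review to one fixed-length vector:

# Embed the first three reviews; each becomes a 50-dimensional vector
emb = hub_layer(batch[0][0:3])
print(emb.shape)  # expected: (3, 50)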

Let’s now build and compile the full model:

model = keras.Sequential([
  hub_layer,
  layers.Dense(units = 16, activation = 'relu'),
  layers.Dense(units = 1, activation = 'sigmoid')
])
model.compile(
  optimizer = 'adam',
  loss = "binary_crossentropy",
  metrics = 'accuracy'  
)
model <- keras_model_sequential() %>%
  hub_layer() %>%
  layer_dense(16, activation = 'relu') %>%
  layer_dense(1)

Compile the model. Unlike the Python model above, the final dense layer here outputs logits (no sigmoid activation), so the loss is told to expect logits via from_logits = TRUE:

model %>% compile(
  optimizer = 'adam',
  loss = loss_binary_crossentropy(from_logits = TRUE),
  metrics = 'accuracy'
)

4 Training

history = model.fit(
  train_data,
  epochs = 10,
  validation_data = validation_data,
  verbose = 2
)
Epoch 1/10
40/40 - 5s - loss: 0.6486 - accuracy: 0.6365 - val_loss: 0.5793 - val_accuracy: 0.7404 - 5s/epoch - 116ms/step
Epoch 2/10
40/40 - 4s - loss: 0.5020 - accuracy: 0.7958 - val_loss: 0.4466 - val_accuracy: 0.8248 - 4s/epoch - 103ms/step
Epoch 3/10
40/40 - 4s - loss: 0.3539 - accuracy: 0.8721 - val_loss: 0.3525 - val_accuracy: 0.8576 - 4s/epoch - 102ms/step
Epoch 4/10
40/40 - 4s - loss: 0.2502 - accuracy: 0.9140 - val_loss: 0.3130 - val_accuracy: 0.8718 - 4s/epoch - 101ms/step
Epoch 5/10
40/40 - 4s - loss: 0.1830 - accuracy: 0.9411 - val_loss: 0.2950 - val_accuracy: 0.8780 - 4s/epoch - 102ms/step
Epoch 6/10
40/40 - 4s - loss: 0.1335 - accuracy: 0.9617 - val_loss: 0.2912 - val_accuracy: 0.8828 - 4s/epoch - 101ms/step
Epoch 7/10
40/40 - 4s - loss: 0.0964 - accuracy: 0.9762 - val_loss: 0.2978 - val_accuracy: 0.8796 - 4s/epoch - 102ms/step
Epoch 8/10
40/40 - 4s - loss: 0.0701 - accuracy: 0.9862 - val_loss: 0.3121 - val_accuracy: 0.8772 - 4s/epoch - 103ms/step
Epoch 9/10
40/40 - 4s - loss: 0.0503 - accuracy: 0.9919 - val_loss: 0.3200 - val_accuracy: 0.8768 - 4s/epoch - 103ms/step
Epoch 10/10
40/40 - 4s - loss: 0.0363 - accuracy: 0.9962 - val_loss: 0.3346 - val_accuracy: 0.8758 - 4s/epoch - 102ms/step
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 keras_layer (KerasLayer)    (None, 50)                48190600  
                                                                 
 dense (Dense)               (None, 16)                816       
                                                                 
 dense_1 (Dense)             (None, 1)                 17        
                                                                 
=================================================================
Total params: 48191433 (183.84 MB)
Trainable params: 48191433 (183.84 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
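
Nearly all of the roughly 48 million parameters sit in the pretrained hub layer: 48,190,600 / 50 = 963,812 embedding rows of dimension 50 (consistent with the nnlm-en-dim50 vocabulary), while the two dense heads contribute only 50 x 16 + 16 = 816 and 16 x 1 + 1 = 17 parameters. Setting trainable = True when creating the hub layer is what makes these pretrained weights fine-tunable, which is why all 48,191,433 parameters are listed as trainable.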

Visualize the training process:

plt.figure()
plt.ylabel("Loss (training and validation)")
plt.xlabel("Training Epoches")
plt.ylim([0, 1])
(0.0, 1.0)
plt.plot(history.history["loss"])
plt.plot(history.history["val_loss"])
plt.show()

plt.figure()
plt.ylabel("Accuracy (training and validation)")
plt.xlabel("Training Epoches")
plt.ylim([0, 1])
(0.0, 1.0)
plt.plot(history.history["accuracy"])
plt.plot(history.history["val_accuracy"])
plt.show()
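
The curves show the training loss decreasing steadily while the validation loss bottoms out around epoch 6, a sign of overfitting. One option (not used in the runs shown here; the patience value is an assumption) is to add an early-stopping callback to the fit call:

# Sketch only: stop once validation loss has not improved for 2 epochs
# and roll back to the best weights seen so far
early_stop = keras.callbacks.EarlyStopping(
  monitor = "val_loss",
  patience = 2,
  restore_best_weights = True
)
# model.fit(train_data, epochs = 10, validation_data = validation_data,
#           callbacks = [early_stop], verbose = 2)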

system.time({
history <- model %>% fit(
  train_data,
  epochs = 10,
  validation_data = validation_data,
  verbose = 2
)
})
Epoch 1/10
40/40 - 4s - loss: 0.6355 - accuracy: 0.5730 - val_loss: 0.5708 - val_accuracy: 0.6666 - 4s/epoch - 109ms/step
Epoch 2/10
40/40 - 4s - loss: 0.4922 - accuracy: 0.7526 - val_loss: 0.4346 - val_accuracy: 0.8016 - 4s/epoch - 102ms/step
Epoch 3/10
40/40 - 4s - loss: 0.3411 - accuracy: 0.8591 - val_loss: 0.3436 - val_accuracy: 0.8518 - 4s/epoch - 103ms/step
Epoch 4/10
40/40 - 4s - loss: 0.2422 - accuracy: 0.9082 - val_loss: 0.3091 - val_accuracy: 0.8584 - 4s/epoch - 104ms/step
Epoch 5/10
40/40 - 4s - loss: 0.1784 - accuracy: 0.9370 - val_loss: 0.2940 - val_accuracy: 0.8700 - 4s/epoch - 102ms/step
Epoch 6/10
40/40 - 4s - loss: 0.1317 - accuracy: 0.9579 - val_loss: 0.2908 - val_accuracy: 0.8736 - 4s/epoch - 101ms/step
Epoch 7/10
40/40 - 4s - loss: 0.0961 - accuracy: 0.9744 - val_loss: 0.2993 - val_accuracy: 0.8778 - 4s/epoch - 102ms/step
Epoch 8/10
40/40 - 4s - loss: 0.0695 - accuracy: 0.9844 - val_loss: 0.3140 - val_accuracy: 0.8798 - 4s/epoch - 101ms/step
Epoch 9/10
40/40 - 4s - loss: 0.0497 - accuracy: 0.9908 - val_loss: 0.3249 - val_accuracy: 0.8780 - 4s/epoch - 104ms/step
Epoch 10/10
40/40 - 4s - loss: 0.0355 - accuracy: 0.9948 - val_loss: 0.3405 - val_accuracy: 0.8774 - 4s/epoch - 103ms/step
   user  system elapsed 
169.269  38.564  41.263 
summary(model)
Model: "sequential_1"
________________________________________________________________________________
 Layer (type)                       Output Shape                    Param #     
================================================================================
 keras_layer_1 (KerasLayer)         (None, 50)                      48190600    
 dense_3 (Dense)                    (None, 16)                      816         
 dense_2 (Dense)                    (None, 1)                       17          
================================================================================
Total params: 48191433 (183.84 MB)
Trainable params: 48191433 (183.84 MB)
Non-trainable params: 0 (0.00 Byte)
________________________________________________________________________________

Visualize the training process:

plot(history)

5 Testing

results = model.evaluate(test_data, verbose = 2)
49/49 - 2s - loss: 0.3673 - accuracy: 0.8598 - 2s/epoch - 40ms/step
results
[0.36734020709991455, 0.8597599864006042]
results <- model %>% evaluate(test_data, verbose = 2)
49/49 - 2s - loss: 0.3775 - accuracy: 0.8544 - 2s/epoch - 40ms/step
results
     loss  accuracy 
0.3774581 0.8543600
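
Because the hub embedding layer consumes raw strings, the trained model can score new reviews directly. A minimal sketch (the example sentences are made up; the outputs are sigmoid probabilities of the positive class):

# Score two hypothetical reviews; values closer to 1 indicate positive sentiment
examples = tf.constant([
  "An absolute delight from start to finish.",
  "Dull, predictable, and far too long."
])
print(model.predict(examples))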