There is a branch of Deep Learning dedicated to processing sequences and time series: Recurrent Neural Networks (RNNs). Long Short-Term Memory networks (LSTMs) are one of the most widely used types of RNN. Gated Recurrent Units (GRUs) are the other popular variant.
This is an illustration from http://colah.github.io/posts/2015-08-Understanding-LSTMs/ (A highly recommended read)
Pros:
Cons:
Also read The Unreasonable Effectiveness of Recurrent Neural Networks by Andrej Karpathy. Finish by having a browse through this Stackoverflow Question.
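To give a feel for how interchangeable these layer types are in Keras, here is a minimal sketch (mine, not from the links above) of the same tiny character model built once with an LSTM and once with a GRU; only the recurrent layer changes.
from keras.models import Sequential
from keras.layers import Embedding, LSTM, GRU, Dense

def tiny_char_model(recurrent_layer, vocab_size=27, dim=64):
    """Embedding -> recurrent layer -> softmax over the character vocabulary."""
    model = Sequential()
    model.add(Embedding(vocab_size, dim))
    model.add(recurrent_layer(dim))
    model.add(Dense(vocab_size, activation='softmax'))
    model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
    return model

lstm_model = tiny_char_model(LSTM)
gru_model = tiny_char_model(GRU)   # a GRU has fewer parameters per unit than an LSTM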
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Activation, Dropout, Flatten, Dense, BatchNormalization, LSTM, Embedding, TimeDistributed
Using Theano backend.
def chr2val(ch):
    """Map a character to an integer: 0 for non-letters, 1-26 for a-z (case-insensitive)."""
    ch = ch.lower()
    if ch.isalpha():
        return 1 + (ord(ch) - ord('a'))
    else:
        return 0

def val2chr(v):
    """Inverse mapping: 0 -> space, 1-26 -> a-z."""
    if v == 0:
        return ' '
    else:
        return chr(ord('a') + v - 1)
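A quick round-trip check (my addition, not in the original notebook): letters map to 1-26 regardless of case, and everything else collapses to 0, which decodes back as a space.
encoded = [chr2val(c) for c in "Shall I?"]
print(encoded)                               # [19, 8, 1, 12, 12, 0, 9, 0]
print(''.join(val2chr(v) for v in encoded))  # prints: shall i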
with open("sonnets.txt") as f:
    text = f.read()
text_num = np.array([chr2val(c) for c in text])
print(text[:100])
print(text_num[:100])
THE SONNETS by William Shakespeare I From fairest creatures we desire increase, That thereby be
[20 8 5 0 19 15 14 14 5 20 19 0 2 25 0 23 9 12 12 9 1 13 0 19 8 1 11 5 19 16 5 1 18 5 0 0 0 0 0 9 0 0 6 18 15 13 0 6 1 9 18 5 19 20 0 3 18 5 1 20 21 18 5 19 0 23 5 0 4 5 19 9 18 5 0 9 14 3 18 5 1 19 5 0 0 20 8 1 20 0 20 8 5 18 5 2 25 0 2 5]
The numbers representing the characters lie in the range:
[min(text_num), max(text_num)]
[0, 26]
Prepare the data
len_vocab = 27
sentence_len = 40
# n_chars = len(text_num)//sentence_len*sentence_len
num_chunks = len(text_num)-sentence_len
def get_batches(int_text, batch_size, seq_length):
    """
    Return batches of input and target
    :param int_text: Text with the characters replaced by their ids
    :param batch_size: The batch size
    :param seq_length: The sequence length
    :return: A tuple (x, y) of lists of Numpy arrays, with y shifted one step ahead of x
    """
    slice_size = batch_size * seq_length
    n_batches = len(int_text) // slice_size
    x = int_text[: n_batches * slice_size]
    y = int_text[1: n_batches * slice_size + 1]
    x = np.split(np.reshape(x, (batch_size, -1)), n_batches, 1)
    y = np.split(np.reshape(y, (batch_size, -1)), n_batches, 1)
    return x, y
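get_batches is not actually used in the rest of this notebook, but a quick toy check (my own) shows what it returns: n_batches arrays of shape (batch_size, seq_length), with the targets shifted one step ahead of the inputs.
ids = np.arange(13)                        # a toy 'text' of 13 ids
bx, by = get_batches(ids, batch_size=2, seq_length=3)
print(len(bx), bx[0].shape)                # 2 batches, each of shape (2, 3)
print(bx[0])                               # [[0 1 2], [6 7 8]]
print(by[0])                               # [[1 2 3], [7 8 9]]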
x = np.zeros((num_chunks, sentence_len))
y = np.zeros(num_chunks)
for i in range(num_chunks):
    x[i, :] = text_num[i:i + sentence_len]
    y[i] = text_num[i + sentence_len]   # target: the single character that follows the window
# x = np.reshape(x, (num_chunks, sentence_len, 1))
x.shape
(95610, 40)
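As a quick sanity check (my addition), decoding the first window and its target should reproduce the first 40 characters of the sonnets, lower-cased and with punctuation collapsed to spaces:
print(''.join(val2chr(int(v)) for v in x[0]))   # the 40-character input window
print(val2chr(int(y[0])))                       # the character that follows it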
model = Sequential()
model.add(Embedding(len_vocab, 64))
model.add(LSTM(64))
model.add(Dense(len_vocab, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_2 (Embedding)      (None, None, 64)          1728
_________________________________________________________________
lstm_2 (LSTM)                (None, 64)                33024
_________________________________________________________________
dense_2 (Dense)              (None, 27)                1755
=================================================================
Total params: 36,507.0
Trainable params: 36,507
Non-trainable params: 0.0
_________________________________________________________________
Embedding?
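For readers without the notebook open, here is a minimal sketch (mine, not part of the original) of what the Embedding layer does: it maps each integer character id to a dense, trainable vector, one per timestep.
demo = Sequential()
demo.add(Embedding(input_dim=27, output_dim=4))   # 27 symbols -> 4-dimensional vectors
demo.compile(optimizer='rmsprop', loss='mse')     # compiled only so predict() can be called
vecs = demo.predict(np.array([[20, 8, 5]]))       # the ids for 't', 'h', 'e'
print(vecs.shape)                                 # (1, 3, 4): batch, timesteps, embedding dim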
np.random.choice(3,10,p=[0.99, 0.01, 0])
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
for i in range(10):
    # Train for one epoch, then sample from the model to see how it is improving
    model.fit(x, y, batch_size=128, epochs=1)
    sentence = []
    idx = np.random.choice(len(x), 1)   # pick a random 40-character seed window
    x_test = x[idx]
    if idx == len(x) - 1:
        idx -= 1                        # keep idx+1 in range for the comparison print below
    # sentence.append(val2chr(idx[0]))
    for j in range(100):
        p = model.predict(x_test)
        # sample the next character id from the predicted distribution
        idx2 = np.random.choice(27, 1, p=p.ravel())
        # slide the window: drop the first character, append the sampled one
        x_test = np.hstack([x_test[:, 1:], idx2[None, :]])
        sentence.append(val2chr(idx2[0]))
    print(''.join(sentence))
    print('-'*20)
    # print what actually follows the seed window in the sonnets, for comparison
    print(''.join([val2chr(int(v)) for v in x[idx+1, :].tolist()[0]]))
    print('='*40)
Epoch 1/1
95610/95610 [==============================] - 100s - loss: 2.4096
esseeas ach co wiwsil an wingull taur to dthuth fe lith fanl thit no thecives veiss he heag tha
--------------------
feit so that other mine thou wilt resto
========================================
Epoch 1/1
95610/95610 [==============================] - 130s - loss: 2.0563
e koume pying copwist love wirnt toll were is my my rate thoueene glioks of ghich stis arly kea
--------------------
s have drain d his blood and fill d his
========================================
Epoch 1/1
95610/95610 [==============================] - 125s - loss: 1.9219
banch deture whor all sweartay i if liin love theeag liln heautt s tha lether flove un thou
--------------------
made that millions of strange shadows
========================================
Epoch 1/1
95610/95610 [==============================] - 123s - loss: 1.8404
o if yore cand acker a glace in be deach be the with but tith his ase ade hime that one tooth ate a
--------------------
e mute or if they sing tis with so
========================================
Epoch 1/1
95610/95610 [==============================] - 133s - loss: 1.7855
t of tipen knagy alals lacven sight besed thy swicken migh loke gacs your by were make which noem
--------------------
eart xlvii betwixt mine eye and heart
========================================
Epoch 1/1
95610/95610 [==============================] - 130s - loss: 1.7439
eypring fithel a take wriches eveming alip sim no loves is wore suskces wost in that faist less
--------------------
that time you should live twice in
========================================
Epoch 1/1
95610/95610 [==============================] - 131s - loss: 1.7112
ir d swipp a to wen whening my fim brile end xxxiny dores vearder om tho my hark and flendie vi
--------------------
d but thy eternal summer shall not fad
========================================
Epoch 1/1
95610/95610 [==============================] - 121s - loss: 1.6844
pent nxkis nor songu is al my look therefuting o you lies seat thy fies bolk it seen thine n
--------------------
itless usurer why dost thou use so grea
========================================
Epoch 1/1
95610/95610 [==============================] - 114s - loss: 1.6609
tren is allity and i if the prorom you oft do seellovent mut neck tly fase he ore they beauty
--------------------
ease find no determination then you wer
========================================
Epoch 1/1
95610/95610 [==============================] - 112s - loss: 1.6410
pannalter for is doth they ele to retronds and grount impern my etequer i jead thu pair now bey pa
--------------------
ise that purpose not to sell xxii my
========================================
idx2.shape
(1,)
p
array([[[ 1.77297324e-01, 6.54230490e-02, 4.13467437e-02, 1.93433501e-02, 4.56084386e-02, 1.32832751e-02, 4.13009860e-02, 1.62192620e-02, 2.92364117e-02, 5.79766892e-02, 2.69375090e-03, 2.85064196e-03, 2.48198789e-02, 5.82200885e-02, 1.21132769e-02, 3.18055823e-02, 2.07936037e-02, 3.27275461e-03, 1.07144043e-02, 9.25789401e-02, 1.49481997e-01, 8.23497958e-03, 5.28509961e-03, 5.56691065e-02, 3.50074959e-04, 1.39221996e-02, 1.58092094e-04]]], dtype=float32)
sum(p.ravel())
1.0000000006693881
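Because of float32 rounding, the probabilities can sum to slightly more (or less) than 1, as above. If np.random.choice ever complains that the probabilities do not sum to 1, a renormalisation step like this sketch (my own) fixes it:
probs = p.ravel().astype('float64')
probs /= probs.sum()                 # renormalise so the probabilities sum to exactly 1
next_idx = np.random.choice(27, 1, p=probs)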
In the previous model we predicted one character given the previous 40. This time, however, we predict the 2nd to 41st characters given the first 40. Another way of looking at it is that at each input character we are predicting the subsequent character.
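To make the shifted targets concrete, here is a toy example (my own) using the same chr2val encoding: for every position in the input, the target is simply the next character.
toy = "from fairest creatures"
toy_x = [chr2val(c) for c in toy[:-1]]   # characters 1..n-1
toy_y = [chr2val(c) for c in toy[1:]]    # characters 2..n, i.e. the input shifted by one
print(toy[:-1])
print(toy[1:])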
len_vocab = 27
sentence_len = 40
# n_chars = len(text_num)//sentence_len*sentence_len
num_chunks = len(text_num)-sentence_len
x = np.zeros((num_chunks, sentence_len))
y = np.zeros((num_chunks, sentence_len))
for i in range(num_chunks):
    x[i, :] = text_num[i:i + sentence_len]
    y[i, :] = text_num[i + 1:i + sentence_len + 1]   # targets are the inputs shifted by one step
y = y.reshape(y.shape+(1,))
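One detail worth noting (my addition): sparse_categorical_crossentropy combined with a TimeDistributed softmax expects the integer targets to carry a trailing length-1 axis, which is exactly what the reshape above provides.
print(x.shape)   # (95610, 40)    - 40 input character ids per sample
print(y.shape)   # (95610, 40, 1) - one integer target per timestep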
# batch_size = 64
model = Sequential()
model.add(Embedding(len_vocab, 64)) # , batch_size=batch_size
model.add(LSTM(256, return_sequences=True)) # , stateful=True
model.add(TimeDistributed(Dense(len_vocab, activation='softmax')))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_4 (Embedding)      (None, None, 64)          1728
_________________________________________________________________
lstm_4 (LSTM)                (None, None, 256)         328704
_________________________________________________________________
time_distributed_4 (TimeDist (None, None, 27)          6939
=================================================================
Total params: 337,371.0
Trainable params: 337,371
Non-trainable params: 0.0
_________________________________________________________________
for i in range(10):
    # Sample from the model first, then train for an epoch, so the very first
    # sample below shows the untrained network.
    sentence = []
    letter = [np.random.choice(len_vocab, 1)[0]]   # seed with a random letter
    for j in range(100):
        sentence.append(val2chr(letter[-1]))
        p = model.predict(np.array(letter)[None, :])
        # sample the next character from the softmax output at the last timestep
        letter.append(np.random.choice(27, 1, p=p[0][-1])[0])
    print(''.join(sentence))
    print('='*100)
    model.fit(x, y, batch_size=128, epochs=1)
xgkysvomegiidjfcgnipwqinffdzvugzypxpktqw wsd phhpohsybxhmddwjyez vdplnrsfdtadba fvdpdapmayfycoxkxzdc
====================================================================================================
Epoch 1/1
95610/95610 [==============================] - 568s - loss: 2.1108
y mank o swrage hes what yith par mores thou tombindss me plidere the lige i note it thy restice
====================================================================================================
Epoch 1/1
95610/95610 [==============================] - 571s - loss: 1.4977
ng goanst thy wortamen s it the death and his time wou muct pase of says in giving age for vowrary
====================================================================================================
Epoch 1/1
95610/95610 [==============================] - 489s - loss: 1.2286
xz d with a maning when you remide for chink which lopation gild at the worst or love in me so
====================================================================================================
Epoch 1/1
95610/95610 [==============================] - 569s - loss: 0.9996
deffect offlound desire the rich gosed winter sland i now feen fooker love and fairing maighty of
====================================================================================================
Epoch 1/1
95610/95610 [==============================] - 576s - loss: 0.8253
w so both that our feasts to me vice the concer d my life to change my self loving still but fee
====================================================================================================
Epoch 1/1
95610/95610 [==============================] - 564s - loss: 0.7057
xiii against my love s loving after whom thy sweet love will be renew from moury my my beds sh
====================================================================================================
Epoch 1/1
95610/95610 [==============================] - 572s - loss: 0.6242
me that due out of the reason pounting it to my beauty still and may discapty hath moan my sausy
====================================================================================================
Epoch 1/1
95610/95610 [==============================] - 577s - loss: 0.5680
gue s dauges be hear the star to every my being mine mine is true minds must days that is you wan
====================================================================================================
Epoch 1/1
95610/95610 [==============================] - 558s - loss: 0.5281
of thy sweet self dost give invotare self in their robsear having gain not present though it fa
====================================================================================================
Epoch 1/1
95610/95610 [==============================] - 529s - loss: 0.4994
letter = [np.random.choice(len_vocab, 1)[0]]   # choose a random letter
for i in range(100):
    # note: `sentence` is not reset here, so this continues the last sample from above
    sentence.append(val2chr(letter[-1]))
    p = model.predict(np.array(letter)[None, :])
    letter.append(np.random.choice(27, 1, p=p[0][-1])[0])
print(''.join(sentence))
print('='*100)
of thy sweet self dost give invotare self in their robsear having gain not present though it faxxvi whil it was but betrimage it me for my chery verge within that you see st love to call wher
====================================================================================================
y now has the same sequence structure as x (one integer target per timestep), since we are no longer predicting just one character.
x.shape
(95610, 40)