Poem Generator¶

Run in Google Colab

Objective: Train a Long Short-Term Memory (LSTM) network to generate Shakespeare-style poems.

A Long Short-Term Memory (LSTM) network is a type of Recurrent Neural Network (RNN) designed to handle the vanishing gradient problem and learn long-term dependencies in sequential data. LSTMs have become a fundamental component in many deep learning applications, including natural language processing, speech recognition, and time series forecasting.
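
For reference, the standard LSTM cell update (notation varies slightly between textbooks and the Keras implementation) is:

$$
\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) \\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) \\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$

The additive update of the cell state $c_t$ is what lets gradients flow across many time steps, which is how the LSTM mitigates the vanishing gradient problem.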

Import libraries¶

In [1]:
import numpy as np
from keras import Sequential, Input, layers, initializers, callbacks
from datetime import datetime
import sys
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("whitegrid")

Download the dataset¶

In [12]:
!wget -nc --progress=bar:force:noscroll https://raw.githubusercontent.com/LuisAngelMendozaVelasco/Deep_Learning_Specialization/main/Sequence_Models/Week1/Labs/data/shakespeare.txt -P /tmp
File ‘/tmp/shakespeare.txt’ already there; not retrieving.

Load the dataset¶

In [3]:
with open('/tmp/shakespeare.txt', encoding='utf-8') as f:
    text = f.read().lower()

chars = sorted(list(set(text)))
print('Number of unique characters in the dataset:', len(chars))
print("\nText:\n", text[:1000] + "...")
Number of unique characters in the dataset: 38

Text:
 the sonnets

by william shakespeare

from fairest creatures we desire increase,
that thereby beauty's rose might never die,
but as the riper should by time decease,
his tender heir might bear his memory:
but thou contracted to thine own bright eyes,
feed'st thy light's flame with self-substantial fuel,
making a famine where abundance lies,
thy self thy foe, to thy sweet self too cruel:
thou that art now the world's fresh ornament,
and only herald to the gaudy spring,
within thine own bud buriest thy content,
and tender churl mak'st waste in niggarding:
pity the world, or else this glutton be,
to eat the world's due, by the grave and thee.

when forty winters shall besiege thy brow,
and dig deep trenches in thy beauty's field,
thy youth's proud livery so gazed on now,
will be a tattered weed of small worth held:  
then being asked, where all thy beauty lies,
where all the treasure of thy lusty days;
to say within thine own deep sunken eyes,
were an all-eating shame, and thriftless prais...

Create the training set¶

In [4]:
def build_data(text, Tx=40, stride=3):
    """
    Create a training set by scanning a window of size Tx over the text corpus, shifting it by the given stride.

    Arguments:
    text -- string, corpus of Shakespearean sonnets
    Tx -- integer, sequence length, number of time steps (characters) in one training example
    stride -- integer, number of characters the window shifts at each step

    Returns:
    X -- list of training examples
    Y -- list of training labels
    """

    X = []
    Y = []

    for i in range(0, len(text) - Tx, stride):
        X.append(text[i: i + Tx])
        Y.append(text[i + Tx])

    print('Number of training examples:', len(X))

    return X, Y

def vectorization(X, Y, n_x, char_indexes, Tx=40):
    """
    Convert X and Y (lists) into arrays to be given to a recurrent neural network.

    Arguments:
    X -- list of training examples (strings of length Tx)
    Y -- list of training labels (single characters)
    n_x -- integer, number of unique characters in the corpus (vocabulary size)
    char_indexes -- dictionary mapping each character to its index
    Tx -- integer, sequence length

    Returns:
    x -- boolean array of shape (m, Tx, n_x), one-hot encoded inputs
    y -- boolean array of shape (m, n_x), one-hot encoded labels
    """

    m = len(X)
    x = np.zeros((m, Tx, n_x), dtype=bool)
    y = np.zeros((m, n_x), dtype=bool)

    for i, sentence in enumerate(X):
        for t, char in enumerate(sentence):
            x[i, t, char_indexes[char]] = 1

        y[i, char_indexes[Y[i]]] = 1

    return x, y
In [5]:
char_indexes = dict((c, i) for i, c in enumerate(chars))
indexes_char = dict((i, c) for i, c in enumerate(chars))

X, Y = build_data(text)
x, y = vectorization(X, Y, n_x=len(chars), char_indexes=char_indexes)
Number of training examples: 31412
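
To make the windowing concrete, here is a quick sanity check on a toy string (hypothetical values of Tx and stride, not part of the original training setup):

demo_X, demo_Y = build_data("from fairest creatures", Tx=5, stride=2)
print(demo_X[:2], demo_Y[:2])   # ['from ', 'om fa'] ['f', 'i']

Each label is simply the character that immediately follows its window in the corpus.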

Define a custom callback¶

In [6]:
class CustomVerbose(callbacks.Callback):
    def __init__(self, epochs_to_show):
        self.epochs_to_show = epochs_to_show

    def on_epoch_begin(self, epoch, logs=None):
        if epoch in self.epochs_to_show:
            self.epoch_start_time = datetime.now()

    def on_epoch_end(self, epoch, logs=None):
        if epoch in self.epochs_to_show:
            self.epoch_stop_time = datetime.now()
            print(f"Epoch {epoch + 1}/{self.epochs_to_show[-1] + 1}")
            print(f"\telapsed time: {(self.epoch_stop_time - self.epoch_start_time).total_seconds():.3f}s - loss: {logs['loss']:.4f}")
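
Keras calls on_epoch_begin and on_epoch_end once per epoch, so this callback only prints when the zero-based epoch index appears in epochs_to_show. For the epochs=500 run below, that list evaluates as follows (a quick check, not part of the original notebook):

epochs = 500
print([0] + [i for i in range(int(epochs / 10) - 1, epochs, int(epochs / 10))])
# [0, 49, 99, 149, ..., 499] -> reported as epochs 1, 50, 100, ..., 500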

Build an LSTM network¶

In [7]:
model = Sequential([Input(shape=(40, 38)),
                    layers.LSTM(units=128, recurrent_activation="hard_sigmoid", kernel_initializer=initializers.VarianceScaling(mode="fan_avg", distribution="uniform"), return_sequences=True),
                    layers.Dropout(rate=0.5),
                    layers.LSTM(units=128, recurrent_activation="hard_sigmoid", kernel_initializer=initializers.VarianceScaling(mode="fan_avg", distribution="uniform")),
                    layers.Dropout(rate=0.5),
                    layers.Dense(units=38, activation="linear", kernel_initializer=initializers.VarianceScaling(mode="fan_avg", distribution="uniform")),
                    layers.Activation(activation="softmax")])

model.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ lstm (LSTM)                     │ (None, 40, 128)        │        85,504 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 40, 128)        │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ lstm_1 (LSTM)                   │ (None, 128)            │       131,584 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout)             │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 38)             │         4,902 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ activation (Activation)         │ (None, 38)             │             0 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 221,990 (867.15 KB)
 Trainable params: 221,990 (867.15 KB)
 Non-trainable params: 0 (0.00 B)
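
As a sanity check on the summary, the parameter counts can be reproduced by hand: each LSTM layer has four gates, and each gate has an input kernel, a recurrent kernel and a bias.

n_x, units = 38, 128
print(4 * ((n_x + units) * units + units))     # first LSTM:  85,504
print(4 * ((units + units) * units + units))   # second LSTM: 131,584
print(units * n_x + n_x)                       # dense layer: 4,902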

Compile and train the LSTM network¶

In [8]:
model.compile(optimizer="adam", loss="categorical_crossentropy")

epochs = 500
epochs_to_show = [0] + [i for i in range(int(epochs / 10) - 1, epochs, int(epochs / 10))]
custom_verbose = CustomVerbose(epochs_to_show)
history = model.fit(x, y, batch_size=128, epochs=epochs, verbose=0, callbacks=[custom_verbose])
Epoch 1/500
	elapsed time: 11.549s - loss: 3.0619
Epoch 50/500
	elapsed time: 5.099s - loss: 1.5363
Epoch 100/500
	elapsed time: 5.283s - loss: 1.2499
Epoch 150/500
	elapsed time: 2.699s - loss: 1.0875
Epoch 200/500
	elapsed time: 2.770s - loss: 0.9869
Epoch 250/500
	elapsed time: 5.064s - loss: 0.8988
Epoch 300/500
	elapsed time: 5.308s - loss: 0.8450
Epoch 350/500
	elapsed time: 2.812s - loss: 0.7872
Epoch 400/500
	elapsed time: 2.720s - loss: 0.7490
Epoch 450/500
	elapsed time: 5.106s - loss: 0.7055
Epoch 500/500
	elapsed time: 5.197s - loss: 0.6680
In [9]:
plt.figure()
plt.plot(history.history['loss'])
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.show()
Figure: training loss per epoch.

Evaluate the LSTM network¶

In [10]:
def sample(preds, temperature=1.0):
    """Sample an index from a probability array"""

    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature # reweight the distribution: temperature < 1 sharpens it, > 1 flattens it
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds) # renormalize into a valid probability distribution
    probas = np.random.multinomial(1, preds, 1) # draw a one-hot sample from the reweighted distribution
    out = np.random.choice(range(len(chars)), p=probas.ravel()) # convert the one-hot draw back to a character index

    return out

def generate_output(length, Tx=40):
    """Prompt the user for input and generate a poem by transforming the model`s predictions into text"""

    generated = ''
    usr_input = input("Write the beginning of your poem, the Shakespeare machine will complete it. Your input is: ")
    sentence = ('{0:0>' + str(Tx) + '}').format(usr_input).lower() # zero pad the sentence to Tx characters
    generated += usr_input

    sys.stdout.write("Here is your poem: \n\n")
    sys.stdout.write(usr_input)

    for i in range(length):
        x_pred = np.zeros((1, Tx, len(chars)))

        for t, char in enumerate(sentence):
            if char != '0':
                x_pred[0, t, char_indexes[char]] = 1.

        preds = model.predict(x_pred, verbose=0)[0]
        next_index = sample(preds, temperature=1.0)
        next_char = indexes_char[next_index]

        generated += next_char
        sentence = sentence[1:] + next_char

        sys.stdout.write(next_char)
        sys.stdout.flush()

    sys.stdout.write("...")
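
The temperature argument controls how adventurous the sampling is. A minimal sketch of its effect, using made-up probabilities for three characters:

demo_preds = np.array([0.6, 0.3, 0.1])
for T in [0.5, 1.0, 2.0]:
    scaled = np.exp(np.log(demo_preds) / T)
    print(T, np.round(scaled / scaled.sum(), 3)) # low T concentrates mass on the likeliest character, high T spreads it out
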
In [11]:
generate_output(length=500)
Write the beginning of your poem, the Shakespeare machine will complete it. Your input is: Love
Here is your poem: 

Love,
and for a thee and pacger the forgull's be ton me,
no her they love's sout oferid cinle ase commare,
mine ower to be than their prading,
shall for the peep the some't and my poct.

the forthy expresse of faring all your,
for the or homper and i grow my swale,
and your swail cime he i on fresher fair,
have men me from these piste i forut thou,
for you the relente in rey not lifs thy singrand.
i but i crowhen of and loved but yet sheald be belok)
the lover braute thes hall of more on see,
wing p...