07.mnist_lstm

程序说明

时间:2016年11月16日

说明:该程序是一个包含LSTM的神经网络。

数据集:MNIST

1.加载keras模块

from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.datasets import mnist
from keras.utils import np_utils
from keras import initializations

def init_weights(shape, name=None):
    return initializations.normal(shape, scale=0.01, name=name)
Using TensorFlow backend.

如需绘制模型请加载plot

from keras.utils.visualize_util import plot

2.变量初始化

# Hyper parameters
batch_size = 128
nb_epoch = 10

# Parameters for MNIST dataset
img_rows, img_cols = 28, 28
nb_classes = 10

# Parameters for LSTM network
nb_lstm_outputs = 30
nb_time_steps = img_rows
dim_input_vector = img_cols

3.准备数据

# Load MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
print('X_train original shape:', X_train.shape)
input_shape = (nb_time_steps, dim_input_vector)

X_train = X_train.astype('float32') / 255.
X_test = X_test.astype('float32') / 255.
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
('X_train original shape:', (60000, 28, 28))
('X_train shape:', (60000, 28, 28))
(60000, 'train samples')
(10000, 'test samples')

4.建立模型

使用Sequential()

# Build LSTM network
model = Sequential()
model.add(LSTM(nb_lstm_outputs, input_shape=input_shape))
model.add(Dense(nb_classes, activation='softmax', init=init_weights))

打印模型

model.summary()
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
lstm_1 (LSTM)                    (None, 30)            7080        lstm_input_1[0][0]               
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 10)            310         lstm_1[0][0]                     
====================================================================================================
Total params: 7390
____________________________________________________________________________________________________

绘制模型结构图,并保存成图片

plot(model, to_file='lstm_model.png')

显示绘制的图片

image

5.训练与评估

编译模型

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

迭代训练

history = model.fit(X_train, Y_train, nb_epoch=nb_epoch, batch_size=batch_size, shuffle=True, verbose=1)
Epoch 1/10
60000/60000 [==============================] - 28s - loss: 1.3082 - acc: 0.6124    
Epoch 2/10
60000/60000 [==============================] - 28s - loss: 0.5380 - acc: 0.8467    
Epoch 3/10
60000/60000 [==============================] - 28s - loss: 0.3363 - acc: 0.9060    
Epoch 4/10
60000/60000 [==============================] - 28s - loss: 0.2553 - acc: 0.9292    
Epoch 5/10
60000/60000 [==============================] - 28s - loss: 0.2113 - acc: 0.9408    
Epoch 6/10
60000/60000 [==============================] - 28s - loss: 0.1811 - acc: 0.9488    
Epoch 7/10
60000/60000 [==============================] - 28s - loss: 0.1578 - acc: 0.9548    
Epoch 8/10
60000/60000 [==============================] - 28s - loss: 0.1407 - acc: 0.9597    
Epoch 9/10
60000/60000 [==============================] - 28s - loss: 0.1284 - acc: 0.9633    
Epoch 10/10
60000/60000 [==============================] - 28s - loss: 0.1188 - acc: 0.9655    

模型评估

score = model.evaluate(X_test, Y_test, verbose=1)
print('Test score:', score[0])
print('Test accuracy:', score[1])
10000/10000 [==============================] - 5s     
('Test score:', 0.11906883909329773)
('Test accuracy:', 0.96530000000000005)

results matching ""

    No results matching ""