LeNet CNN for MNIST with Keras
Let's revisit the same LeNet architecture with the same dataset to build and train the CNN model in Keras:
- Import the required Keras modules:
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Reshape
from keras.optimizers import SGD
- Define the number of filters for each of the convolutional layers:
n_filters = [32, 64]
- Define the other hyperparameters:
learning_rate = 0.01
n_epochs = 10
batch_size = 100
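The steps that follow also assume that the MNIST dimension variables defined earlier in the chapter are in scope. If you are running this recipe on its own, here is a minimal sketch of those definitions (the values follow from the 28 x 28 grayscale MNIST images and the 10 digit classes):
n_width = 28     # MNIST image width in pixels
n_height = 28    # MNIST image height in pixels
n_depth = 1      # single grayscale channel
n_inputs = n_width * n_height * n_depth   # 784 flattened pixels per image
n_outputs = 10   # one output class per digit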
- Define the sequential model and add a layer to reshape the input data to the shape (n_width, n_height, n_depth):
model = Sequential()
model.add(Reshape(target_shape=(n_width, n_height, n_depth),
                  input_shape=(n_inputs,)))
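The reshape is needed because the MNIST images arrive as flat vectors of 784 pixel values, while Conv2D expects inputs with explicit width, height, and channel dimensions; Keras infers the batch dimension on its own.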
- Add the first convolutional layer with a 4 x 4 kernel filter, SAME padding, and relu activation:
model.add(Conv2D(filters=n_filters[0], kernel_size=4,
                 padding='SAME', activation='relu'))
- Add a pooling layer with a region size of 2 x 2 and a stride of 2 x 2:
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
- Add the second convolutional and pooling layers in the same way as the first:
model.add(Conv2D(filters=n_filters[1], kernel_size=4,
                 padding='SAME', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
- Add a layer to flatten the output of the second pooling layer, followed by a fully connected layer of 1,024 neurons to process the flattened output:
model.add(Flatten())
model.add(Dense(units=1024, activation='relu'))
- Add the final output layer with softmax activation:
model.add(Dense(units=n_outputs, activation='softmax'))
- Print the model summary with the following code:
model.summary()
The model is described as follows:
Layer (type)                 Output Shape              Param #
=================================================================
reshape_1 (Reshape)          (None, 28, 28, 1)         0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 28, 28, 32)        544
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 32)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 14, 14, 64)        32832
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 7, 7, 64)          0
_________________________________________________________________
flatten_1 (Flatten)          (None, 3136)              0
_________________________________________________________________
dense_1 (Dense)              (None, 1024)              3212288
_________________________________________________________________
dense_2 (Dense)              (None, 10)                10250
=================================================================
Total params: 3,255,914
Trainable params: 3,255,914
Non-trainable params: 0
_________________________________________________________________
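As a sanity check, the parameter counts follow from (kernel height x kernel width x input channels + 1 bias) x number of filters for the convolutional layers, and (inputs + 1 bias) x units for the dense layers: conv2d_1 has (4 x 4 x 1 + 1) x 32 = 544 parameters, conv2d_2 has (4 x 4 x 32 + 1) x 64 = 32,832, dense_1 has (3,136 + 1) x 1,024 = 3,212,288, and dense_2 has (1,024 + 1) x 10 = 10,250. The 3,136 inputs to dense_1 are just the flattened 7 x 7 x 64 feature maps.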
- Compile, train, and evaluate the model:
model.compile(loss='categorical_crossentropy',
              optimizer=SGD(lr=learning_rate),
              metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=batch_size,
          epochs=n_epochs)
score = model.evaluate(X_test, Y_test)
print('\nTest loss:', score[0])
print('Test accuracy:', score[1])
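The fit and evaluate calls assume the MNIST arrays are already in memory. Here is a minimal sketch, assuming the TensorFlow 1.x MNIST helper used elsewhere in this chapter; its 55,000 flattened, one-hot encoded training examples match the epoch logs shown next, and the ./mnist data directory is an illustrative choice:
from tensorflow.examples.tutorials.mnist import input_data

# Download (if needed) and load MNIST with one-hot encoded labels;
# the images come back as flat vectors of 784 float pixel values.
mnist = input_data.read_data_sets('./mnist', one_hot=True)
X_train, Y_train = mnist.train.images, mnist.train.labels
X_test, Y_test = mnist.test.images, mnist.test.labels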
Training the model, we get the following output:
Epoch 1/10
55000/55000 [===================] - 267s - loss: 0.8854 - acc: 0.7631
Epoch 2/10
55000/55000 [===================] - 272s - loss: 0.2406 - acc: 0.9272
Epoch 3/10
55000/55000 [===================] - 267s - loss: 0.1712 - acc: 0.9488
Epoch 4/10
55000/55000 [===================] - 295s - loss: 0.1339 - acc: 0.9604
Epoch 5/10
55000/55000 [===================] - 278s - loss: 0.1112 - acc: 0.9667
Epoch 6/10
55000/55000 [===================] - 279s - loss: 0.0957 - acc: 0.9714
Epoch 7/10
55000/55000 [===================] - 316s - loss: 0.0842 - acc: 0.9744
Epoch 8/10
55000/55000 [===================] - 317s - loss: 0.0758 - acc: 0.9773
Epoch 9/10
55000/55000 [===================] - 285s - loss: 0.0693 - acc: 0.9790
Epoch 10/10
55000/55000 [===================] - 217s - loss: 0.0630 - acc: 0.9804
Test loss: 0.0628845927377
Test accuracy: 0.9785
The difference in accuracy can be attributed to the fact that here we used the SGD optimizer, which does not implement some of the advanced features provided by the AdamOptimizer we used for the TensorFlow model.
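To make the comparison direct, you can recompile the same Keras model with the Adam optimizer instead; a minimal sketch (keeping Keras's default Adam settings, which is an illustrative choice rather than a tuned configuration):
from keras.optimizers import Adam

# Swap SGD for Adam and retrain; defaults are used for illustration.
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(),
              metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=batch_size, epochs=n_epochs)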