Implementing Unit Tests

Testing our code can speed up prototyping, make debugging more efficient, speed up change cycles, and make it easier to share code. There are many simple ways to implement unit tests in TensorFlow, and we will cover them in this recipe.

Getting ready

When writing TensorFlow models, it helps to have unit tests that check the functionality of the program. This helps us because, when we want to make changes to a program unit, tests ensure that those changes do not break the model in unexpected ways. In this recipe, we will create a simple CNN network that relies on the MNIST data. With it, we will implement three different types of unit tests to illustrate how to write them in TensorFlow.

Note that Python has a great testing library called Nose. TensorFlow also has built-in testing functionality, which we will look at here; it makes it easier to test the values of Tensor objects without having to evaluate them in a session.

How to do it...

  1. First, we need to load the necessary libraries and format the data, as follows (a quick sanity check of the result appears after the code):
import sys
import numpy as np 
import tensorflow as tf 
from tensorflow.python.framework import ops 
ops.reset_default_graph() 
# Start a graph session 
sess = tf.Session() 
# Load data 
data_dir = 'temp' 
mnist = tf.keras.datasets.mnist
(train_xdata, train_labels), (test_xdata, test_labels) = mnist.load_data()
train_xdata = train_xdata / 255.0
test_xdata = test_xdata / 255.0
# Set model parameters 
batch_size = 100 
learning_rate = 0.005 
evaluation_size = 100 
image_width = train_xdata[0].shape[0] 
image_height = train_xdata[0].shape[1] 
target_size = max(train_labels) + 1 
num_channels = 1 # greyscale = 1 channel 
generations = 100 
eval_every = 5 
conv1_features = 25 
conv2_features = 50 
max_pool_size1 = 2 # NxN window for 1st max pool layer 
max_pool_size2 = 2 # NxN window for 2nd max pool layer 
fully_connected_size1 = 100 
dropout_prob = 0.75
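
Before moving on, it can help to sanity-check what was just loaded. The values below assume the standard Keras MNIST download; this is a minimal sketch, not part of the recipe's script:

# Sanity-check the loaded data (shapes assume the standard Keras MNIST download)
assert train_xdata.shape == (60000, 28, 28)   # 60,000 training images, 28x28 greyscale
assert test_xdata.shape == (10000, 28, 28)    # 10,000 test images
print(image_width, image_height, target_size)  # expected: 28 28 10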
  2. Then we need to declare our placeholders, variables, and model formulation, as follows (a short note on the flattened layer size follows the code):
# Declare model placeholders 
x_input_shape = (batch_size, image_width, image_height, num_channels) 
x_input = tf.placeholder(tf.float32, shape=x_input_shape) 
y_target = tf.placeholder(tf.int32, shape=(batch_size)) 
eval_input_shape = (evaluation_size, image_width, image_height, num_channels) 
eval_input = tf.placeholder(tf.float32, shape=eval_input_shape) 
eval_target = tf.placeholder(tf.int32, shape=(evaluation_size)) 
dropout = tf.placeholder(tf.float32, shape=()) 
# Declare model parameters 
conv1_weight = tf.Variable(tf.truncated_normal([4, 4, num_channels, conv1_features], 
                                              stddev=0.1, dtype=tf.float32)) 
conv1_bias = tf.Variable(tf.zeros([conv1_features], dtype=tf.float32)) 
conv2_weight = tf.Variable(tf.truncated_normal([4, 4, conv1_features, conv2_features], 
                                               stddev=0.1, dtype=tf.float32)) 
conv2_bias = tf.Variable(tf.zeros([conv2_features], dtype=tf.float32)) 
# fully connected variables 
resulting_width = image_width // (max_pool_size1 * max_pool_size2) 
resulting_height = image_height // (max_pool_size1 * max_pool_size2) 
full1_input_size = resulting_width * resulting_height * conv2_features 
full1_weight = tf.Variable(tf.truncated_normal([full1_input_size, fully_connected_size1], 
                          stddev=0.1, dtype=tf.float32)) 
full1_bias = tf.Variable(tf.truncated_normal([fully_connected_size1], stddev=0.1, dtype=tf.float32)) 
full2_weight = tf.Variable(tf.truncated_normal([fully_connected_size1, target_size], 
                                               stddev=0.1, dtype=tf.float32)) 
full2_bias = tf.Variable(tf.truncated_normal([target_size], stddev=0.1, dtype=tf.float32)) 

# Initialize Model Operations 
def my_conv_net(input_data): 
    # First Conv-ReLU-MaxPool Layer 
    conv1 = tf.nn.conv2d(input_data, conv1_weight, strides=[1, 1, 1, 1], padding='SAME') 
    relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_bias)) 
    max_pool1 = tf.nn.max_pool(relu1, ksize=[1, max_pool_size1, max_pool_size1, 1], 
                               strides=[1, max_pool_size1, max_pool_size1, 1], padding='SAME') 
    # Second Conv-ReLU-MaxPool Layer 
    conv2 = tf.nn.conv2d(max_pool1, conv2_weight, strides=[1, 1, 1, 1], padding='SAME') 
    relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_bias)) 
    max_pool2 = tf.nn.max_pool(relu2, ksize=[1, max_pool_size2, max_pool_size2, 1], 
                               strides=[1, max_pool_size2, max_pool_size2, 1], padding='SAME') 
    # Transform Output into a 1xN layer for next fully connected layer 
    final_conv_shape = max_pool2.get_shape().as_list() 
    final_shape = final_conv_shape[1] * final_conv_shape[2] * final_conv_shape[3] 
    flat_output = tf.reshape(max_pool2, [final_conv_shape[0], final_shape]) 
    # First Fully Connected Layer 
    fully_connected1 = tf.nn.relu(tf.add(tf.matmul(flat_output, full1_weight), full1_bias)) 
    # Second Fully Connected Layer 
    final_model_output = tf.add(tf.matmul(fully_connected1, full2_weight), full2_bias) 

    # Add dropout 
    final_model_output = tf.nn.dropout(final_model_output, dropout) 
    return final_model_output 

model_output = my_conv_net(x_input) 
test_model_output = my_conv_net(eval_input)
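
As a side note, the resulting_width and resulting_height computed above are what make the reshape inside my_conv_net line up with full1_weight: with 28 x 28 inputs and two 2 x 2 max-pool layers, the feature maps shrink to 7 x 7. A minimal check, assuming the standard MNIST image size:

# Each 2x2 max-pool halves the spatial dimensions: 28 -> 14 -> 7
assert resulting_width == 7 and resulting_height == 7
assert full1_input_size == 7 * 7 * conv2_features  # 2,450 inputs to the first fully connected layer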
  3. Next, we create our loss function, as well as our prediction and accuracy operations. Then we initialize the model variables, as follows (a quick standalone illustration of the accuracy function appears after the code):
# Declare Loss Function (softmax cross entropy) 
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=model_output, labels=y_target)) 
# Create a prediction function 
prediction = tf.nn.softmax(model_output) 
test_prediction = tf.nn.softmax(test_model_output) 

# Create accuracy function 
def get_accuracy(logits, targets): 
    batch_predictions = np.argmax(logits, axis=1) 
    num_correct = np.sum(np.equal(batch_predictions, targets)) 
    return 100. * num_correct / batch_predictions.shape[0] 

# Create an optimizer 
my_optimizer = tf.train.MomentumOptimizer(learning_rate, 0.9) 
train_step = my_optimizer.minimize(loss) 
# Initialize Variables 
init = tf.global_variables_initializer() 
sess.run(init)
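
As a quick standalone illustration of get_accuracy (step 5 turns this same idea into a proper unit test), here is a sketch with made-up logits:

# Two samples, both predicted correctly, so we expect 100.0
example_logits = np.array([[2.0, 0.1], [0.3, 1.5]])
example_targets = np.array([0, 1])
print(get_accuracy(example_logits, example_targets))  # 100.0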
  4. For our first unit test, we use the tf.test.TestCase class and create a method that tests the value of a placeholder (or variable). For this test case, we make sure that the dropout (keep) probability is greater than 0.25, so that the model never ends up training with more than 75% dropout, as follows (a note on running a single test class on its own appears after the code):
# Check values of tensors! 
class DropOutTest(tf.test.TestCase): 
    # Make sure that we don't drop too much 
    # Note: method names must start with 'test' for tf.test.main() to discover them
    def test_dropout_greaterthan(self):
        with self.test_session():
            # A placeholder has no value until it is fed, so feed the keep probability
            self.assertGreater(dropout.eval(feed_dict={dropout: dropout_prob}), 0.25)
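
If we want to run just this one test class without going through the command-line dispatch shown later, the standard unittest machinery works as well, since tf.test.TestCase extends unittest.TestCase. A minimal sketch:

# Run only DropOutTest with plain unittest
import unittest
suite = unittest.TestLoader().loadTestsFromTestCase(DropOutTest)
unittest.TextTestRunner(verbosity=2).run(suite)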
  5. Next, we need to test that our accuracy function behaves as expected. To do this, we create a sample array of probabilities along with the targets we expect, and then make sure that the computed accuracy returns 100%, as follows:
# Test accuracy function 
class AccuracyTest(tf.test.TestCase): 
    # Make sure accuracy function behaves correctly 
    def test_accuracy_exact(self):
        with self.test_session():
            test_preds = [[0.9, 0.1], [0.01, 0.99]]
            test_targets = [0, 1]
            test_acc = get_accuracy(test_preds, test_targets)
            # get_accuracy() returns a plain NumPy float, so no eval() is needed
            self.assertEqual(test_acc, 100.)
  6. We can also make sure that a Tensor object has the shape we expect. To test that the model output has the expected shape of [batch_size, target_size], enter the following code (an alternative way to write this check appears after it):
# Test tensorshape 
class ShapeTest(tf.test.TestCase): 
    # Make sure our model output is size [batch_size, num_classes] 
    def test_output_shape(self):
        with self.test_session(): 
            numpy_array = np.ones([batch_size, target_size]) 
            self.assertShapeEqual(numpy_array, model_output)
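
Note that assertShapeEqual compares the shape of a NumPy array against the static shape of a tensor. The same check can also be written against the static shape list directly; here is a sketch of an equivalent alternative test class (illustrative only, not part of the recipe's script):

class ShapeListTest(tf.test.TestCase):
    # Alternative: compare the static shape list directly
    def test_output_shape_list(self):
        self.assertEqual(model_output.get_shape().as_list(), [batch_size, target_size])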
  7. Now we need a main() function in our script that tells TensorFlow which application we are running. The script is as follows:
def main(argv):
    # Start training loop
    train_loss = []
    train_acc = []
    test_acc = []
    for i in range(generations):
        rand_index = np.random.choice(len(train_xdata), size=batch_size)
        rand_x = train_xdata[rand_index]
        rand_x = np.expand_dims(rand_x, 3)
        rand_y = train_labels[rand_index]
        train_dict = {x_input: rand_x, y_target: rand_y, dropout: dropout_prob}

        sess.run(train_step, feed_dict=train_dict)
        temp_train_loss, temp_train_preds = sess.run([loss, prediction], feed_dict=train_dict)
        temp_train_acc = get_accuracy(temp_train_preds, rand_y)

        if (i + 1) % eval_every == 0:
            eval_index = np.random.choice(len(test_xdata), size=evaluation_size)
            eval_x = test_xdata[eval_index]
            eval_x = np.expand_dims(eval_x, 3)
            eval_y = test_labels[eval_index]
            test_dict = {eval_input: eval_x, eval_target: eval_y, dropout: 1.0}
            test_preds = sess.run(test_prediction, feed_dict=test_dict)
            temp_test_acc = get_accuracy(test_preds, eval_y)

            # Record and print results
            train_loss.append(temp_train_loss)
            train_acc.append(temp_train_acc)
            test_acc.append(temp_test_acc)
            acc_and_loss = [(i + 1), temp_train_loss, temp_train_acc, temp_test_acc]
            acc_and_loss = [np.round(x, 2) for x in acc_and_loss]
            print('Generation # {}. Train Loss: {:.2f}. Train Acc (Test Acc): {:.2f} ({:.2f})'.format(*acc_and_loss))
  8. For our script to either run the tests or run training, we call it in different ways from the command line. The following snippet is the main program code; if the program receives the argument test, it performs the unit tests; otherwise, it runs training:
if __name__ == '__main__':
    cmd_args = sys.argv
    if len(cmd_args) > 1 and cmd_args[1] == 'test':
        # Perform unit-tests
        tf.test.main(argv=cmd_args[1:])
    else:
        # Run the TensorFlow app
        tf.app.run(main=None, argv=cmd_args)
  9. If we run the program on the command line with the test argument, we should get output similar to the following:
$ python3 implementing_unit_tests.py test
...
----------------------------------------------------------------------
Ran 3 tests in 0.001s

OK

The complete program described in the preceding steps can be found in the book's GitHub repository at https://github.com/nfmcclure/tensorflow_cookbook/ and in the Packt repository at https://github.com/PacktPublishing/TensorFlow-Machine-Learning-Cookbook-Second-Edition

How it works...

In this section, we implemented three types of unit tests: tensor values, operation outputs, and tensor shapes. TensorFlow has many more kinds of unit test functions, which can be found here: https://www.tensorflow.org/versions/master/api_docs/python/test.html

Remember that unit tests help assure us that code functions as expected, give us confidence when sharing code, and make reproducibility easier.
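
As a small taste of those additional functions, tf.test.TestCase also provides element-wise assertions such as assertAllEqual and assertAllClose; a minimal sketch (the class name here is illustrative):

class MoreAssertionsTest(tf.test.TestCase):
    def test_allclose_and_allequal(self):
        with self.test_session():
            self.assertAllEqual([1, 2, 3], [1, 2, 3])  # exact element-wise equality
            self.assertAllClose([1.0, 2.0], [1.0, 2.0 + 1e-7], atol=1e-5)  # near-equality with tolerance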
