# Simple Linear Model

## Imports

```python
%matplotlib inline
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
from sklearn.metrics import confusion_matrix
```

```python
tf.__version__
```
```
'0.12.0-rc1'
```

## Load Data

The MNIST data-set is about 12 MB and will be downloaded automatically if it is not located in the given path.

```python
from tensorflow.examples.tutorials.mnist import input_data
data = input_data.read_data_sets("data/MNIST/", one_hot=True)
```
```
Extracting data/MNIST/train-images-idx3-ubyte.gz
Extracting data/MNIST/train-labels-idx1-ubyte.gz
Extracting data/MNIST/t10k-images-idx3-ubyte.gz
Extracting data/MNIST/t10k-labels-idx1-ubyte.gz
```

```python
print("Size of:")
print("- Training-set:\t\t{}".format(len(data.train.labels)))
print("- Test-set:\t\t{}".format(len(data.test.labels)))
print("- Validation-set:\t{}".format(len(data.validation.labels)))
```
```
Size of:
- Training-set:		55000
- Test-set:		10000
- Validation-set:	5000
```

### One-Hot Encoding

```python
data.test.labels[0:5, :]
```
```
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.]])
```

```python
data.test.cls = np.array([label.argmax() for label in data.test.labels])
```

```python
data.test.cls[0:5]
```
```
array([7, 2, 1, 0, 4])
```
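The conversion from One-Hot encoding back to class numbers can be sketched in plain NumPy. The labels below are made up for illustration, not taken from MNIST:

```python
import numpy as np

# One-Hot encoded labels: exactly one element per row is 1,
# and its column index is the class number.
labels_one_hot = np.array([[0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
                           [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.]])

# argmax along each row recovers the integer class.
cls = np.array([label.argmax() for label in labels_one_hot])
print(cls)  # → [7 2]
```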

### Data Dimensions

```python
# We know that MNIST images are 28 pixels in each dimension.
img_size = 28

# Images are stored in one-dimensional arrays of this length.
img_size_flat = img_size * img_size

# Tuple with height and width of images used to reshape arrays.
img_shape = (img_size, img_size)

# Number of classes, one class for each of 10 digits.
num_classes = 10
```
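The relationship between the flattened and 2-dimensional representations can be illustrated with NumPy, using a dummy image rather than real MNIST data:

```python
import numpy as np

img_size = 28
img_size_flat = img_size * img_size
img_shape = (img_size, img_size)

# A dummy flattened image of length 784.
flat = np.arange(img_size_flat, dtype=np.float32)

# Reshape to 28 x 28 for plotting, then flatten back;
# no pixel values are changed, only the shape.
img = flat.reshape(img_shape)
assert img.shape == (28, 28)
assert np.array_equal(img.reshape(-1), flat)
```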

### Helper function for plotting images

```python
def plot_images(images, cls_true, cls_pred=None):
    assert len(images) == len(cls_true) == 9

    # Create figure with 3x3 sub-plots.
    fig, axes = plt.subplots(3, 3)

    for i, ax in enumerate(axes.flat):
        # Plot image.
        ax.imshow(images[i].reshape(img_shape), cmap='binary')

        # Show true and predicted classes.
        if cls_pred is None:
            xlabel = "True: {0}".format(cls_true[i])
        else:
            xlabel = "True: {0}, Pred: {1}".format(cls_true[i], cls_pred[i])

        ax.set_xlabel(xlabel)

        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])
```

### Plot a few images to see whether the data is correct

```python
# Get the first images from the test-set.
images = data.test.images[0:9]

# Get the true classes for those images.
cls_true = data.test.cls[0:9]

# Plot the images and labels using our helper-function above.
plot_images(images=images, cls_true=cls_true)
```

## TensorFlow Graph

The entire purpose of TensorFlow is to build a so-called computational graph that can be executed much more efficiently than performing the same calculations directly in Python. TensorFlow can be more efficient than NumPy because TensorFlow knows the entire computation graph that must be executed, while NumPy only knows about a single mathematical operation at a time.

TensorFlow can also automatically calculate the gradients of the variables that need to be optimized, so as to make the model perform better. This is because the graph is a combination of simple mathematical expressions, so the gradient of the entire graph can be derived using the chain-rule.

A TensorFlow graph consists of the following parts:

- Placeholder variables used to change the input to the graph.
- Model variables that are going to be optimized, so as to make the model perform better.
- The model, which is essentially just a mathematical function that calculates some output given the placeholder variables and the model variables.
- A cost measure used to guide the optimization of the variables.
- An optimization method which updates the variables of the model.

### Placeholder variables

Placeholder variables serve as the input to the graph, which we may change each time we execute the graph. This is called feeding the placeholder variables, and it is described further below.

```python
x = tf.placeholder(tf.float32, [None, img_size_flat])
```

```python
y_true = tf.placeholder(tf.float32, [None, num_classes])
```

```python
y_true_cls = tf.placeholder(tf.int64, [None])
```

### Variables to be optimized

```python
weights = tf.Variable(tf.zeros([img_size_flat, num_classes]))
```

```python
biases = tf.Variable(tf.zeros([num_classes]))
```

### Model

```python
logits = tf.matmul(x, weights) + biases
```

```python
y_pred = tf.nn.softmax(logits)
```

```python
y_pred_cls = tf.argmax(y_pred, dimension=1)
```
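The softmax and argmax steps can be sketched in NumPy. The logits below are made up for illustration, with 2 images and 3 classes to keep the printout small:

```python
import numpy as np

# Made-up logits for 2 images and 3 classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.0]])

# Softmax: exponentiate and normalize each row so it sums to 1.
# Subtracting the row-max first is the usual numerical-stability trick.
e = np.exp(logits - logits.max(axis=1, keepdims=True))
y_pred = e / e.sum(axis=1, keepdims=True)

# Each row is now a probability distribution over the classes.
assert np.allclose(y_pred.sum(axis=1), 1.0)

# argmax along axis 1 picks the most likely class, like tf.argmax.
y_pred_cls = y_pred.argmax(axis=1)
print(y_pred_cls)  # → [0 1]
```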

### Cost-function to be optimized

TensorFlow has a built-in function for calculating the cross-entropy. Note that it uses the values of the `logits` because it also calculates the softmax internally.

```python
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                        labels=y_true)
```

```python
cost = tf.reduce_mean(cross_entropy)
```
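What this computes can be sketched in NumPy, again with made-up logits and one-hot labels:

```python
import numpy as np

# Made-up logits for 2 images and 3 classes, with one-hot true labels.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.0]])
y_true = np.array([[1., 0., 0.],
                   [0., 1., 0.]])

# Numerically stable softmax.
e = np.exp(logits - logits.max(axis=1, keepdims=True))
y_pred = e / e.sum(axis=1, keepdims=True)

# Cross-entropy per image: -sum over classes of y_true * log(y_pred).
cross_entropy = -np.sum(y_true * np.log(y_pred), axis=1)

# The cost is the mean over all images, like tf.reduce_mean.
cost = cross_entropy.mean()
assert cross_entropy.shape == (2,)
assert cost > 0.0
```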

### Optimization method

```python
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(cost)
```

### Performance measures

```python
correct_prediction = tf.equal(y_pred_cls, y_true_cls)
```

```python
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
```
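These two operations amount to the following NumPy computation, shown here with made-up predictions and labels:

```python
import numpy as np

# Made-up predicted and true class numbers for 5 images.
y_pred_cls = np.array([7, 2, 1, 0, 4])
y_true_cls = np.array([7, 2, 1, 0, 9])

# Boolean vector: was each image classified correctly?
correct_prediction = (y_pred_cls == y_true_cls)

# Cast booleans to floats and average, like tf.cast + tf.reduce_mean.
accuracy = correct_prediction.astype(np.float32).mean()
print(accuracy)  # → 0.8
```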

## Run TensorFlow

### Create TensorFlow session

```python
session = tf.Session()
```

### Initialize variables

```python
session.run(tf.global_variables_initializer())
```

### Helper function to perform optimization iterations

```python
batch_size = 100
```

```python
def optimize(num_iterations):
    for i in range(num_iterations):
        # Get a batch of training examples.
        # x_batch now holds a batch of images and
        # y_true_batch are the true labels for those images.
        x_batch, y_true_batch = data.train.next_batch(batch_size)

        # Put the batch into a dict with the proper names
        # for placeholder variables in the TensorFlow graph.
        # Note that the placeholder for y_true_cls is not set
        # because it is not used during training.
        feed_dict_train = {x: x_batch,
                           y_true: y_true_batch}

        # Run the optimizer using this batch of training data.
        # TensorFlow assigns the variables in feed_dict_train
        # to the placeholder variables and then runs the optimizer.
        session.run(optimizer, feed_dict=feed_dict_train)
```

### Helper functions to show performance

```python
feed_dict_test = {x: data.test.images,
                  y_true: data.test.labels,
                  y_true_cls: data.test.cls}
```

```python
def print_accuracy():
    # Use TensorFlow to compute the accuracy.
    acc = session.run(accuracy, feed_dict=feed_dict_test)

    # Print the accuracy.
    print("Accuracy on test-set: {0:.1%}".format(acc))
```

Function for printing and plotting the confusion matrix using scikit-learn.

```python
def print_confusion_matrix():
    # Get the true classifications for the test-set.
    cls_true = data.test.cls

    # Get the predicted classifications for the test-set.
    cls_pred = session.run(y_pred_cls, feed_dict=feed_dict_test)

    # Get the confusion matrix using sklearn.
    cm = confusion_matrix(y_true=cls_true,
                          y_pred=cls_pred)

    # Print the confusion matrix as text.
    print(cm)

    # Plot the confusion matrix as an image.
    plt.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues)

    # Make various adjustments to the plot.
    plt.tight_layout()
    plt.colorbar()
    tick_marks = np.arange(num_classes)
    plt.xticks(tick_marks, range(num_classes))
    plt.yticks(tick_marks, range(num_classes))
    plt.xlabel('Predicted')
    plt.ylabel('True')
```
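On a small made-up example, a confusion matrix simply counts how often each (true, predicted) pair occurs. The pure-NumPy sketch below mirrors what `sklearn.metrics.confusion_matrix` returns here:

```python
import numpy as np

# Made-up true and predicted classes for 6 images, 3 classes.
cls_true = np.array([0, 0, 1, 1, 2, 2])
cls_pred = np.array([0, 1, 1, 1, 2, 0])

num_classes = 3
cm = np.zeros((num_classes, num_classes), dtype=int)
for t, p in zip(cls_true, cls_pred):
    cm[t, p] += 1  # row = true class, column = predicted class

print(cm)
# [[1 1 0]
#  [0 2 0]
#  [1 0 1]]
```

The diagonal holds the correctly classified counts; off-diagonal entries show which classes get confused with each other.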

```python
def plot_example_errors():
    # Use TensorFlow to get a list of boolean values
    # whether each test-image has been correctly classified,
    # and a list for the predicted class of each image.
    correct, cls_pred = session.run([correct_prediction, y_pred_cls],
                                    feed_dict=feed_dict_test)

    # Negate the boolean array.
    incorrect = (correct == False)

    # Get the images from the test-set that have been
    # incorrectly classified.
    images = data.test.images[incorrect]

    # Get the predicted classes for those images.
    cls_pred = cls_pred[incorrect]

    # Get the true classes for those images.
    cls_true = data.test.cls[incorrect]

    # Plot the first 9 images.
    plot_images(images=images[0:9],
                cls_true=cls_true[0:9],
                cls_pred=cls_pred[0:9])
```
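The boolean indexing used above to select the misclassified images works like this in NumPy, with made-up data:

```python
import numpy as np

# Made-up correctness flags and predicted classes for 5 images.
correct = np.array([True, False, True, False, True])
cls_pred = np.array([7, 3, 1, 8, 4])

# Negate the boolean array to get a mask of the errors.
incorrect = (correct == False)

# Indexing with a boolean mask keeps only the entries where it is True.
print(cls_pred[incorrect])  # → [3 8]
```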

### Helper function to plot the model weights

```python
def plot_weights():
    # Get the values for the weights from the TensorFlow variable.
    w = session.run(weights)

    # Get the lowest and highest values for the weights.
    # This is used to correct the colour intensity across
    # the images so they can be compared with each other.
    w_min = np.min(w)
    w_max = np.max(w)

    # Create figure with 3x4 sub-plots,
    # where the last 2 sub-plots are unused.
    fig, axes = plt.subplots(3, 4)

    for i, ax in enumerate(axes.flat):
        # Only use the weights for the first 10 sub-plots.
        if i < 10:
            # Get the weights for the i'th digit and reshape it.
            # Note that w.shape == (img_size_flat, 10)
            image = w[:, i].reshape(img_shape)

            # Set the label for the sub-plot.
            ax.set_xlabel("Weights: {0}".format(i))

            # Plot the image.
            ax.imshow(image, vmin=w_min, vmax=w_max, cmap='seismic')

        # Remove ticks from each sub-plot.
        ax.set_xticks([])
        ax.set_yticks([])
```

## Performance before any optimization

```python
print_accuracy()
```
```
Accuracy on test-set: 9.8%
```

```python
plot_example_errors()
```

## Performance after 1 optimization iteration

```python
optimize(num_iterations=1)
```

```python
print_accuracy()
```
```
Accuracy on test-set: 40.7%
```

```python
plot_example_errors()
```

```python
plot_weights()
```

## Performance after 10 optimization iterations

```python
# We have already performed 1 iteration.
optimize(num_iterations=9)
```

```python
print_accuracy()
```
```
Accuracy on test-set: 78.2%
```

```python
plot_example_errors()
```

```python
plot_weights()
```

## Performance after 1000 optimization iterations

```python
# We have already performed 10 iterations.
optimize(num_iterations=990)
```

```python
print_accuracy()
```
```
Accuracy on test-set: 91.7%
```

```python
plot_example_errors()
```

```python
plot_weights()
```

```python
print_confusion_matrix()
```
```
[[ 957    0    3    2    0    5   11    1    1    0]
 [   0 1108    2    2    1    2    4    2   14    0]
 [   4    9  914   19   15    5   13   14   35    4]
 [   1    0   16  928    0   28    2   14   13    8]
 [   1    1    3    2  939    0   10    2    6   18]
 [  10    3    3   33   10  784   17    6   19    7]
 [   8    3    3    2   11   14  915    1    1    0]
 [   3    9   21    9    7    1    0  959    2   17]
 [   8    8    8   38   11   40   14   18  825    4]
 [  11    7    1   13   75   13    1   39    4  845]]
```

```python
# This has been commented out in case you want to modify and experiment
# with the Notebook without having to restart it.
# session.close()
```

## Exercises

These are a few suggestions for exercises that may help improve your skills with TensorFlow. It is important to get hands-on experience with TensorFlow in order to learn how to use it properly.

- Change the learning-rate for the optimizer.
- Change the optimizer, e.g. to `AdagradOptimizer` or `AdamOptimizer`.
- Change the batch-size to e.g. 1 or 1000.
- How do these changes affect the performance?
- Do you think these changes will have the same effect on other classification problems and mathematical models?
- Do you get the exact same results if you run the Notebook multiple times without changing any parameters? Why or why not?
- Change the `plot_example_errors()` function so it also prints the `logits` and `y_pred` values for the misclassified examples.
- Use `sparse_softmax_cross_entropy_with_logits` instead of `softmax_cross_entropy_with_logits`. This may require changes in several places of the code. Discuss the advantages and disadvantages of the two methods.
- Remake the program yourself without looking too much at this source code.
- Explain to a friend how the program works.
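A hint for the exercise on `sparse_softmax_cross_entropy_with_logits`: the two functions compute the same quantity, the sparse variant just takes integer class numbers instead of One-Hot labels. A NumPy sketch of the equivalence, with made-up logits:

```python
import numpy as np

logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.0]])
cls = np.array([0, 2])        # integer class labels ("sparse")
one_hot = np.eye(3)[cls]      # the same labels, One-Hot encoded

# Log-softmax, computed in a numerically stable way.
e = np.exp(logits - logits.max(axis=1, keepdims=True))
log_softmax = np.log(e / e.sum(axis=1, keepdims=True))

# Dense version: sum over classes weighted by the One-Hot labels.
dense = -np.sum(one_hot * log_softmax, axis=1)

# Sparse version: pick out the log-probability of the true class directly.
sparse = -log_softmax[np.arange(len(cls)), cls]

assert np.allclose(dense, sparse)
```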