Keras Study Notes

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

Using the Sequential model

A Sequential model is appropriate for a plain stack of layers, where each layer has exactly one input tensor and one output tensor.

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu", name="layer1"),
        layers.Dense(3, activation="relu", name="layer2"),
        layers.Dense(4, name="layer3"),
    ]
)
x = tf.ones((3, 3))
x
<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]], dtype=float32)>
y = model(x)
y
<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]], dtype=float32)>

The code above is equivalent to the following:

# Create 3 layers
layer1 = layers.Dense(2, activation="relu", name="layer1")
layer2 = layers.Dense(3, activation="relu", name="layer2")
layer3 = layers.Dense(4, name="layer3")

# Call layers on a test input
x = tf.ones((3, 3))
y = layer3(layer2(layer1(x)))
y
<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]], dtype=float32)>

A Sequential model is not appropriate when:

The model has multiple inputs or multiple outputs
Any layer has multiple inputs or multiple outputs
Layer sharing is needed
A non-linear topology is needed (e.g. residual connections, multi-branch models); see the Functional API sketch below
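
In those cases the Functional API is the usual alternative. As a rough illustration, here is a minimal sketch of a two-branch model that a Sequential model cannot express (the layer sizes and variable names are made up for this note, not taken from the guide):

# Minimal Functional API sketch of a multi-branch model (hypothetical sizes).
inputs = keras.Input(shape=(4,))
branch_a = layers.Dense(8, activation="relu")(inputs)   # first branch
branch_b = layers.Dense(8, activation="relu")(inputs)   # second branch, sharing the same input
merged = layers.concatenate([branch_a, branch_b])       # non-linear topology: the two paths merge here
outputs = layers.Dense(1)(merged)
multi_branch_model = keras.Model(inputs=inputs, outputs=outputs)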

The layers of a model can be accessed via its layers attribute:

model.layers
[<tensorflow.python.keras.layers.core.Dense at 0x1ee679d3cf8>,
 <tensorflow.python.keras.layers.core.Dense at 0x1ee679f3a90>,
 <tensorflow.python.keras.layers.core.Dense at 0x1ee67a0c208>]

A Sequential model can also be created incrementally via the add() method:

model = keras.Sequential()
model.add(layers.Dense(2, activation="relu"))
model.add(layers.Dense(3, activation="relu"))
model.add(layers.Dense(4))

There is also a corresponding pop() method to remove layers: a Sequential model behaves very much like a list of layers.

print(len(model.layers))
3
model.pop()
print(len(model.layers))
2

The Sequential constructor accepts a name argument, just like any layer or model in Keras. This is useful for annotating TensorBoard graphs with semantically meaningful names.

model = keras.Sequential(name="my_sequential")
model.add(layers.Dense(2, activation="relu", name="layer1"))
model.add(layers.Dense(3, activation="relu", name="layer2"))
model.add(layers.Dense(4, name="layer3"))
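
As a rough sketch of where those names end up, a TensorBoard callback can be attached during training; the toy data, optimizer, and log directory below are placeholders added for this note, not part of the original guide:

# Hypothetical training run whose graph and metrics are logged to TensorBoard,
# where "layer1"/"layer2"/"layer3" and "my_sequential" appear by name.
tensorboard_cb = keras.callbacks.TensorBoard(log_dir="logs")  # "logs" is a placeholder path
model.compile(optimizer="adam", loss="mse")
model.fit(tf.ones((8, 4)), tf.ones((8, 4)), epochs=1, callbacks=[tensorboard_cb])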

Specifying the input shape in advance

All layers in Keras need to know the shape of their inputs in order to create their weights. So when you create a layer like the one below, it has no weights at first:

layer = layers.Dense(3)
layer.weights
[]

Since the shape of the weights depends on the shape of the inputs, the weights are created the first time the layer is called on an input:

x = tf.ones((1, 4))
y = layer(x)
layer.weights
[<tf.Variable 'dense_3/kernel:0' shape=(4, 3) dtype=float32, numpy=
 array([[-0.23496091, -0.42415935, -0.38969237],
        [ 0.47878957,  0.6321573 ,  0.53070235],
        [-0.57678986,  0.5862113 , -0.5439472 ],
        [-0.8276289 ,  0.88936853, -0.6267946 ]], dtype=float32)>,
 <tf.Variable 'dense_3/bias:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]

The same applies to Sequential models. When you instantiate a Sequential model without an input shape, it is not "built": it has no weights (and calling model.weights results in an error stating just this). The weights are created when the model first sees some input data:

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)

x = tf.ones((1, 4))
y = model(x)
print("Number of weights after calling the model:", len(model.weights)) # 6
Number of weights after calling the model: 6

Once a model is "built", you can call its summary() method to display its contents:

model.summary()
Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_4 (Dense)              (1, 2)                    10        
_________________________________________________________________
dense_5 (Dense)              (1, 3)                    9         
_________________________________________________________________
dense_6 (Dense)              (1, 4)                    16        
=================================================================
Total params: 35
Trainable params: 35
Non-trainable params: 0
_________________________________________________________________

However, when building a Sequential model incrementally, it can be very useful to display the summary of the model so far, including the current output shape. In this case, you should start the model by passing an Input object to it, so that it knows its input shape from the start:

model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))

model.summary()
Model: "sequential_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_10 (Dense)             (None, 2)                 10        
=================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________

Note that the Input object is not a layer, so it is not displayed as part of model.layers:

model.layers
[<tensorflow.python.keras.layers.core.Dense at 0x1ee68ade4e0>]

A simple alternative is to just pass an input_shape argument to the first layer:

model = keras.Sequential()
model.add(layers.Dense(2, activation="relu", input_shape=(4,)))

model.summary()
Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_11 (Dense)             (None, 2)                 10        
=================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________

Models built with a predefined input shape like this always have weights (even before seeing any data) and always have a defined output shape.

In general, the recommended best practice is to always specify the input shape of a Sequential model in advance if you know what it is.
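
As a quick check (added here, not part of the original notes), a model declared with an Input already has weights before it has processed any data:

# Sketch: declaring the input shape up front builds the layer immediately.
model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))
print(len(model.weights))  # 2 -> the Dense kernel and bias already exist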

A common debugging workflow: add() + summary()

When building a new Sequential architecture, it is useful to incrementally stack layers with add() and frequently print model summaries. For instance, this lets you monitor how a stack of Conv2D and MaxPooling2D layers is downsampling the image feature maps:

model = keras.Sequential()
model.add(keras.Input(shape=(250, 250, 3))) # 250x250 RGB images
model.add(layers.Conv2D(32, 5, strides=2, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))

model.summary()

# The summary above shows the feature maps are now (40, 40, 32), so we can keep downsampling.

model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(2))

model.summary()

# Now that we have 4x4 feature maps, time to apply global max pooling.
model.add(layers.GlobalMaxPooling2D())

# Finally, we add a classification layer.
model.add(layers.Dense(10))
Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_6 (Conv2D)            (None, 123, 123, 32)      2432      
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 121, 121, 32)      9248      
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 40, 40, 32)        0         
=================================================================
Total params: 11,680
Trainable params: 11,680
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_6 (Conv2D)            (None, 123, 123, 32)      2432      
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 121, 121, 32)      9248      
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 40, 40, 32)        0         
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 38, 38, 32)        9248      
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 36, 36, 32)        9248      
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 12, 12, 32)        0         
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 10, 10, 32)        9248      
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 8, 8, 32)          9248      
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 4, 4, 32)          0         
=================================================================
Total params: 48,672
Trainable params: 48,672
Non-trainable params: 0
_________________________________________________________________

What to do once you have a model

Once your model architecture is ready, you will want to (see the sketch after this list):

Train your model, evaluate it, and run inference.
Save your model to disk and restore it.
Speed up model training by leveraging multiple GPUs.
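
Those steps are covered in the corresponding Keras guides. As a minimal, self-contained sketch of the train/evaluate/predict/save loop (the toy data, layer sizes, and file path below are made up for illustration, not taken from the guide):

import numpy as np

# Hypothetical toy data for a small 4-feature, 3-class problem.
x_train = np.random.random((64, 4)).astype("float32")
y_train = np.random.randint(0, 3, size=(64,))

toy_model = keras.Sequential(
    [
        keras.Input(shape=(4,)),
        layers.Dense(8, activation="relu"),
        layers.Dense(3),
    ]
)

# Train and evaluate.
toy_model.compile(
    optimizer="adam",
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
toy_model.fit(x_train, y_train, epochs=2, batch_size=16)
toy_model.evaluate(x_train, y_train)

# Run inference, then save to disk and restore.
predictions = toy_model(x_train[:1])
toy_model.save("my_toy_model")  # placeholder path
restored_model = keras.models.load_model("my_toy_model")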

Feature extraction with a Sequential model

Once a Sequential model has been built, it behaves like a Functional API model. This means that every layer has an input and an output attribute. These attributes can be used to do neat things, such as quickly creating a model that extracts the outputs of all the intermediate layers of a Sequential model:

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)

# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)
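
Here features is a list with one tensor per layer of initial_model. A quick added check of the shapes (assuming the default 'valid' padding):

# One output tensor per intermediate layer.
for f in features:
    print(f.shape)
# Expected: (1, 123, 123, 32), (1, 121, 121, 32), (1, 119, 119, 32)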

Here is a similar example that only extracts the features from one layer:

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu", name="my_intermediate_layer"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=initial_model.get_layer(name="my_intermediate_layer").output,
)
# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)
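
In this case features is a single tensor, the output of the layer named my_intermediate_layer. An added sanity check (again assuming the default 'valid' padding):

print(features.shape)  # (1, 121, 121, 32)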