How to correctly use tf.layers.batch_normalization() in TensorFlow?

Question

I am confused by tf.layers.batch_normalization in TensorFlow.

My code is as follows:

import tensorflow as tf


def my_net(x, num_classes, phase_train, scope):
    x = tf.layers.conv2d(...)
    x = tf.layers.batch_normalization(x, training=phase_train)
    x = tf.nn.relu(x) 
    x = tf.layers.max_pooling2d(...)

    # some other stuff
    ...

    # return 
    return x

def train():
    phase_train = tf.placeholder(tf.bool, name='phase_train')
    image_node = tf.placeholder(tf.float32, shape=[batch_size, HEIGHT, WIDTH, 3])
    images, labels = data_loader(train_set)
    val_images, val_labels = data_loader(validation_set)
    prediction_op = my_net(image_node, num_classes=2,phase_train=phase_train, scope='Branch1')

    loss_op = loss(...)
    # some other stuff
    optimizer = tf.train.AdamOptimizer(base_learning_rate)
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        train_op = optimizer.minimize(loss=total_loss, global_step=global_step)
    sess = ...
    coord = ...
    while not coord.should_stop():
        image_batch, label_batch = sess.run([images, labels])
        _,loss_value= sess.run([train_op,loss_op], feed_dict={image_node:image_batch,label_node:label_batch,phase_train:True})

        step = step+1

        if step==NUM_TRAIN_SAMPLES:
            for _ in range(NUM_VAL_SAMPLES // batch_size):
                image_batch, label_batch = sess.run([val_images, val_labels])
                prediction_batch = sess.run([prediction_op], feed_dict={image_node:image_batch,label_node:label_batch,phase_train:False})
            val_accuracy = compute_accuracy(...)


def test():
    phase_train = tf.placeholder(tf.bool, name='phase_train')
    image_node = tf.placeholder(tf.float32, shape=[batch_size, HEIGHT, WIDTH, 3])
    test_images, test_labels = data_loader(test_set)
    prediction_op = my_net(image_node, num_classes=2,phase_train=phase_train, scope='Branch1')

    # some stuff to load the trained weights into the graph
    saver.restore(...)

    for _ in range(NUM_TEST_SAMPLES // batch_size):
        image_batch, label_batch = sess.run([test_images, test_labels])
        prediction_batch = sess.run([prediction_op], feed_dict={image_node:image_batch,label_node:label_batch,phase_train:False})
    test_accuracy = compute_accuracy(...)

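Training seems to work well, and val_accuracy is reasonable (say 0.70). The problem is this: when I test with the trained model (the test function), test_accuracy is very low (say 0.000270) if phase_train is set to False, but it looks correct (say 0.69) when phase_train is set to True.

As I understand it, phase_train should be False in the test phase, right? I am not sure where the problem is. Have I misunderstood batch normalization?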

- mining

Related: tf.layers.batch_normalization large test error - Ivan Aksamentov - Drop

Hi @Drop, thanks for your comment. Yes, I have already added the dependency on update_ops in the train function, but the error persists. - mining

Setting training=False is correct. The problem may not lie in batch normalization. Are you sure you are loading the model checkpoint correctly? - kww

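Hi @KathyWu, thanks for your comment. Yes, I think the loading is correct, because I also tried a model without BN: it loaded correctly and its predictions were reasonable. The tf.layers.batch_normalization layer has two parameters, beta and gamma. When using BN I also loaded scope/batch_normalization_1/beta:0 and scope/batch_normalization_1/gamma:0. The problem is that the predictions in the test phase are reasonable when I set phase_train to True, whereas normally phase_train should be False. - mining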
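
A note on the comment above: a tf.layers.batch_normalization layer creates four variables, not two. Besides the trainable gamma and beta, it keeps the non-trainable moving_mean and moving_variance, and those are the statistics used when training=False. If the list of variables to save or restore is built from tf.trainable_variables(), the two moving statistics are skipped and inference runs on their initial values. Below is a minimal sketch of a name-based check, assuming you can obtain the names of the variables that were restored. The helper name missing_bn_statistics is hypothetical, and it matches on the leaf names only, so a non-BN variable that happens to be called beta or gamma would also be reported.

```python
# Hypothetical helper: given the names of the variables that were restored,
# report batch-norm layers whose moving statistics were left out.
BN_TRAINABLE = ("gamma", "beta")
BN_STATISTICS = ("moving_mean", "moving_variance")


def missing_bn_statistics(restored_names):
    """Map each batch-norm layer prefix to the statistics that were not restored."""
    seen = {}
    for name in restored_names:
        base = name.split(":")[0]               # drop the ":0" tensor suffix
        prefix, _, leaf = base.rpartition("/")  # split layer prefix from leaf name
        if leaf in BN_TRAINABLE + BN_STATISTICS:
            seen.setdefault(prefix, set()).add(leaf)
    missing = {}
    for prefix, leaves in seen.items():
        absent = [s for s in BN_STATISTICS if s not in leaves]
        if absent:
            missing[prefix] = absent
    return missing


# The two names mentioned in the comment above.
restored = [
    "scope/batch_normalization_1/beta:0",
    "scope/batch_normalization_1/gamma:0",
]
print(missing_bn_statistics(restored))
# {'scope/batch_normalization_1': ['moving_mean', 'moving_variance']}
```

In TF 1.x, a tf.train.Saver() created without a var_list covers all saveable variables, which includes the moving statistics.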

@mining After I added ... with tf.control_dependencies(update_ops): ..., phase_train = False works correctly in the test phase. - William
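
Why the update ops matter can be shown without TensorFlow. The sketch below is a plain-Python stand-in for a single batch-norm feature, not the library's implementation. The class name TinyBatchNorm, the run_update_ops flag, and the toy data are made up for illustration; the defaults (momentum 0.99, epsilon 0.001, moving mean starting at 0 and moving variance at 1) mirror the documented defaults of tf.layers.batch_normalization. If the moving statistics are never updated during training, inference mode normalises with mean 0 and variance 1, and the outputs land far from the range the following layers were trained on.

```python
import random


class TinyBatchNorm:
    """One-feature batch norm with moving statistics (illustrative only)."""

    def __init__(self, momentum=0.99, epsilon=1e-3):
        self.momentum = momentum
        self.epsilon = epsilon
        self.moving_mean = 0.0   # same initial values as the TF layer
        self.moving_var = 1.0

    def __call__(self, batch, training, run_update_ops=True):
        if training:
            # Training mode: normalise with the statistics of the current batch.
            mean = sum(batch) / len(batch)
            var = sum((v - mean) ** 2 for v in batch) / len(batch)
            if run_update_ops:
                # This is the work that the UPDATE_OPS collection does in TF 1.x.
                m = self.momentum
                self.moving_mean = m * self.moving_mean + (1 - m) * mean
                self.moving_var = m * self.moving_var + (1 - m) * var
        else:
            # Inference mode: normalise with the stored moving statistics.
            mean, var = self.moving_mean, self.moving_var
        return [(v - mean) / (var + self.epsilon) ** 0.5 for v in batch]


def train(run_update_ops, steps=2000, seed=0):
    """Feed batches drawn from N(50, 10^2) through the layer in training mode."""
    rng = random.Random(seed)
    bn = TinyBatchNorm()
    for _ in range(steps):
        batch = [rng.gauss(50.0, 10.0) for _ in range(32)]
        bn(batch, training=True, run_update_ops=run_update_ops)
    return bn


with_updates = train(run_update_ops=True)
without_updates = train(run_update_ops=False)

sample = [50.0]  # a typical input value
print(with_updates(sample, training=False))     # close to 0
print(without_updates(sample, training=False))  # about 50, far off scale
```

With phase_train=True the layer ignores the moving statistics altogether, which is consistent with the test accuracy looking correct in that mode.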

Hi @Tom, thank you very much for your feedback! - mining

1 Answer

- Matěj Račinský · Answer 1

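This could be a bug somewhere in your code, or it could simply be overfitting. If you evaluate on the training data, is the accuracy as high as it was during training? If the problem lies with batch normalization, the training error will be higher in non-training mode than in training mode. If the problem is overfitting, batch normalization is probably not the cause, and the root cause lies elsewhere.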
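
The check suggested in this answer can be written down as a small decision helper. This is only a sketch of the reasoning: the function name diagnose, its arguments, the 0.1 tolerance, and the example accuracies are assumptions chosen for illustration, and the real numbers would come from your own evaluation runs.

```python
def diagnose(train_acc_training_mode, train_acc_inference_mode,
             test_acc_inference_mode, tolerance=0.1):
    """Classify the symptom from three accuracies in [0, 1].

    train_acc_training_mode:  training data, training=True
    train_acc_inference_mode: training data, training=False
    test_acc_inference_mode:  held-out data, training=False
    """
    if train_acc_training_mode - train_acc_inference_mode > tolerance:
        # Same data, only the BN mode differs: the moving statistics are suspect.
        return "batch normalization"
    if train_acc_inference_mode - test_acc_inference_mode > tolerance:
        # Inference mode works on training data but not on held-out data.
        return "overfitting or another cause"
    return "no large gap"


# Made-up numbers for illustration.
print(diagnose(0.70, 0.0003, 0.0003))  # batch normalization
print(diagnose(0.95, 0.94, 0.60))      # overfitting or another cause
```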