I am using TensorFlow to train a DNN. I learned that Batch Normalization is very helpful for DNNs, so I used it in my network.
我使用"tf.layers.batch_normalization"并按照API文档的说明构建网络:在训练时,将其参数"training=True",在验证时,则设置为"training=False"。并且添加tf.get_collection(tf.GraphKeys.UPDATE_OPS)。
Here is my actual code:
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np

input_node_num = 257 * 7
output_node_num = 257
hid_node_num = 2048

tf_X = tf.placeholder(tf.float32, [None, input_node_num])
tf_Y = tf.placeholder(tf.float32, [None, output_node_num])
dropout_rate = tf.placeholder(tf.float32)   # passed to tf.nn.dropout, i.e. it is the keep probability
flag_training = tf.placeholder(tf.bool)     # True for training steps, False for validation

# Three hidden layers: fully connected -> batch norm -> ReLU -> dropout
h1 = tf.contrib.layers.fully_connected(tf_X, hid_node_num, activation_fn=None)
h1_2 = tf.nn.relu(tf.layers.batch_normalization(h1, training=flag_training))
h1_3 = tf.nn.dropout(h1_2, dropout_rate)

h2 = tf.contrib.layers.fully_connected(h1_3, hid_node_num, activation_fn=None)
h2_2 = tf.nn.relu(tf.layers.batch_normalization(h2, training=flag_training))
h2_3 = tf.nn.dropout(h2_2, dropout_rate)

h3 = tf.contrib.layers.fully_connected(h2_3, hid_node_num, activation_fn=None)
h3_2 = tf.nn.relu(tf.layers.batch_normalization(h3, training=flag_training))
h3_3 = tf.nn.dropout(h3_2, dropout_rate)

tf_Y_pre = tf.contrib.layers.fully_connected(h3_3, output_node_num, activation_fn=None)
loss = tf.reduce_mean(tf.square(tf_Y - tf_Y_pre))

# Make the moving-average update ops run together with each training step
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_step = tf.train.AdamOptimizer(1e-4).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i1 in range(3000 * num_batch):   # num_batch: number of mini-batches per epoch (defined elsewhere)
        train_feature = ...  # Some processing
        train_label = ...    # Some processing
        # Training step with flag_training=True. With training=True here and training=False at validation
        # I get a bad result; with training=False both here and at validation the result is better.
        sess.run(train_step, feed_dict={tf_X: train_feature, tf_Y: train_label,
                                        flag_training: True, dropout_rate: 1})
        if (i1 + 1) % 277200 == 0:   # print the validation loss every 0.1 epoch
            validate_feature = ...  # Some processing
            validate_label = ...    # Some processing
            validate_loss = sess.run(loss, feed_dict={tf_X: validate_feature, tf_Y: validate_label,
                                                      flag_training: False, dropout_rate: 1})
            print(validate_loss)
Is there anything wrong with my code? If the code is correct, I think I am getting a strange result:
With "training=True" during training and "training=False" during validation, the result is bad. I print the validation loss every 0.1 epoch; the validation losses from the 1st to the 3rd epoch are:
0.929624
0.992692
0.814033
0.858562
1.042705
0.665418
0.753507
0.700503
0.508338
0.761886
0.787044
0.817034
0.726586
0.901634
0.633383
0.783920
0.528140
0.847496
0.804937
0.828761
0.802314
0.855557
0.702335
0.764318
0.776465
0.719034
0.678497
0.596230
0.739280
0.970555
However, when I change that line to "sess.run(train_step,feed_dict={tf_X:train_feature,tf_Y:train_label,flag_training:False,dropout_rate:1})", that is, "training=False" during training and "training=False" during validation, the result is very good. The validation losses for the first epoch are:
0.474313
0.391002
0.369357
0.366732
0.383477
0.346027
0.336518
0.368153
0.330749
0.322070
0.335551
Why does this happen? Shouldn't I set "training=True" during training and "training=False" during validation?
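In case it is relevant, this is a small check I would add to see whether the moving averages of the batch-norm layers are actually being updated during training (a minimal debugging sketch; I simply look the variables up by the default "moving_mean" / "moving_variance" names that tf.layers.batch_normalization uses):

# Minimal debugging sketch, to run inside the training session.
# Assumption: tf.layers.batch_normalization keeps its statistics in variables whose names
# contain "moving_mean" / "moving_variance" (its default naming).
bn_stats = [v for v in tf.global_variables()
            if 'moving_mean' in v.name or 'moving_variance' in v.name]
print([v.name for v in bn_stats])      # should list two variables per batch_normalization layer
print(sess.run(bn_stats[0])[:5])       # values should drift away from 0 as training proceeds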