Convolutional neural network with decreasing accuracy


When the number of training examples is large (100,000), the accuracy of my convolutional neural network actually decreases. When the number of training examples is smaller (6,000), the accuracy increases up to a certain point and then starts to drop.

Example:

nr_training_examples 100000
tb 2500
epoch 0  loss 0.19646 acc 18.52
nr_test_examples 5000
Accuract test set 0.00
nr_training_examples 100000
tb 2500
epoch 1  loss 0.20000 acc 0.00
nr_test_examples 5000
Accuract test set 0.00
nr_training_examples 100000
tb 2500

What should I do to get this working correctly?

I am using photos of faces as training examples (70 x 70 pixels).

The network is inspired by the VGG model:

2 x conv-3
max_pooling
2 x conv-3
max_pooling
2 x conv-3
1 x conv-1
max_pooling
2 x conv-3
1 x conv-1
max_pooling
fully_connected 1024
fully_connected 1024 - output 128

Here is the model:

def siamese_convnet(x):
    global keep_rate
    #reshape input

    w_conv1_1 = tf.get_variable(name='w_conv1_1', initializer=tf.random_normal([3, 3, 1, 64]))
    w_conv1_2 = tf.get_variable(name='w_conv1_2', initializer=tf.random_normal([3, 3, 64, 64]))

    w_conv2_1 = tf.get_variable(name='w_conv2_1', initializer=tf.random_normal([3, 3, 64, 128]))
    w_conv2_2 = tf.get_variable(name='w_conv2_2', initializer=tf.random_normal([3, 3, 128, 128]))

    w_conv3_1 = tf.get_variable(name='w_conv3_1', initializer=tf.random_normal([3, 3, 128, 256]))
    w_conv3_2 = tf.get_variable(name='w_conv3_2', initializer=tf.random_normal([3, 3, 256, 256]))
    w_conv3_3 = tf.get_variable(name='w_conv3_3', initializer=tf.random_normal([1, 1, 256, 256]))

    w_conv4_1 = tf.get_variable(name='w_conv4_1', initializer=tf.random_normal([3, 3, 256, 512]))
    w_conv4_2 = tf.get_variable(name='w_conv4_2', initializer=tf.random_normal([3, 3, 512, 512]))
    w_conv4_3 = tf.get_variable(name='w_conv4_3', initializer=tf.random_normal([1, 1, 512, 512]))

    w_conv5_1 = tf.get_variable(name='w_conv5_1', initializer=tf.random_normal([3, 3, 512, 512]))
    w_conv5_2 = tf.get_variable(name='w_conv5_2', initializer=tf.random_normal([3, 3, 512, 512]))
    w_conv5_3 = tf.get_variable(name='w_conv5_3', initializer=tf.random_normal([1, 1, 512, 512]))

    w_fc_1 = tf.get_variable(name='fc_1', initializer=tf.random_normal([2*2*512, 1024]))
    w_fc_2 = tf.get_variable(name='fc_2', initializer=tf.random_normal([1024, 1024]))

    fc_layer = tf.get_variable(name='fc_layer', initializer=tf.random_normal([1024, 1024]))
    w_out = tf.get_variable(name='w_out', initializer=tf.random_normal([1024, 128]))

    bias_conv1_1 = tf.get_variable(name='bias_conv1_1', initializer=tf.random_normal([64]))
    bias_conv1_2 = tf.get_variable(name='bias_conv1_2', initializer=tf.random_normal([64]))

    bias_conv2_1 = tf.get_variable(name='bias_conv2_1', initializer=tf.random_normal([128]))
    bias_conv2_2 = tf.get_variable(name='bias_conv2_2', initializer=tf.random_normal([128]))

    bias_conv3_1 = tf.get_variable(name='bias_conv3_1', initializer=tf.random_normal([256]))
    bias_conv3_2 = tf.get_variable(name='bias_conv3_2', initializer=tf.random_normal([256]))
    bias_conv3_3 = tf.get_variable(name='bias_conv3_3', initializer=tf.random_normal([256]))

    bias_conv4_1 = tf.get_variable(name='bias_conv4_1', initializer=tf.random_normal([512]))
    bias_conv4_2 = tf.get_variable(name='bias_conv4_2', initializer=tf.random_normal([512]))
    bias_conv4_3 = tf.get_variable(name='bias_conv4_3', initializer=tf.random_normal([512]))

    bias_conv5_1 = tf.get_variable(name='bias_conv5_1', initializer=tf.random_normal([512]))
    bias_conv5_2 = tf.get_variable(name='bias_conv5_2', initializer=tf.random_normal([512]))
    bias_conv5_3 = tf.get_variable(name='bias_conv5_3', initializer=tf.random_normal([512]))

    bias_fc_1 = tf.get_variable(name='bias_fc_1', initializer=tf.random_normal([1024]))
    bias_fc_2 = tf.get_variable(name='bias_fc_2', initializer=tf.random_normal([1024]))

    bias_fc = tf.get_variable(name='bias_fc', initializer=tf.random_normal([1024]))
    out = tf.get_variable(name='out', initializer=tf.random_normal([128]))

    x = tf.reshape(x, [-1, 70, 70, 1])

    conv1_1 = tf.nn.relu(conv2d(x, w_conv1_1) + bias_conv1_1)
    conv1_2 = tf.nn.relu(conv2d(conv1_1, w_conv1_2) + bias_conv1_2)

    max_pool1 = max_pool(conv1_2)

    conv2_1 = tf.nn.relu(conv2d(max_pool1, w_conv2_1) + bias_conv2_1)
    conv2_2 = tf.nn.relu(conv2d(conv2_1, w_conv2_2) + bias_conv2_2)

    max_pool2 = max_pool(conv2_2)

    conv3_1 = tf.nn.relu(conv2d(max_pool2, w_conv3_1) + bias_conv3_1)
    conv3_2 = tf.nn.relu(conv2d(conv3_1, w_conv3_2) + bias_conv3_2)
    conv3_3 = tf.nn.relu(conv2d(conv3_2, w_conv3_3) + bias_conv3_3)

    max_pool3 = max_pool(conv3_3)

    conv4_1 = tf.nn.relu(conv2d(max_pool3, w_conv4_1) + bias_conv4_1)
    conv4_2 = tf.nn.relu(conv2d(conv4_1, w_conv4_2) + bias_conv4_2)
    conv4_3 = tf.nn.relu(conv2d(conv4_2, w_conv4_3) + bias_conv4_3)

    max_pool4 = max_pool(conv4_3)

    conv5_1 = tf.nn.relu(conv2d(max_pool4, w_conv5_1) + bias_conv5_1)
    conv5_2 = tf.nn.relu(conv2d(conv5_1, w_conv5_2) + bias_conv5_2)
    conv5_3 = tf.nn.relu(conv2d(conv5_2, w_conv5_3) + bias_conv5_3)

    max_pool5 = max_pool(conv5_3)

    fc_helper = tf.reshape(max_pool4, [-1, 2*2*512])
    fc_1 = tf.nn.relu(tf.matmul(fc_helper, w_fc_1) + bias_fc_1)
    #fc_2 = tf.nn.relu(tf.matmul(fc_1, w_fc_2) + bias_fc_1)

    fc = tf.nn.relu(tf.matmul(fc_1, fc_layer) + bias_fc)

    output = tf.matmul(fc, w_out) + out

    output = tf.nn.l2_normalize(output, 0)

    return output
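
conv2d and max_pool are small helper functions that are not shown in the post. A minimal sketch of what they presumably look like, assuming stride-1 'SAME' convolutions and 2 x 2 max pooling as in the original VGG design:

def conv2d(x, w):
    # assumed: stride-1 convolution with 'SAME' padding
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')

def max_pool(x):
    # assumed: 2x2 max pooling with stride 2
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
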
2 Answers

Accuracy increases up to a certain point and then starts to drop. This is a sign that your neural network is overfitting. If you are still in doubt, check the cost function: if it starts rising at some point, I am fairly sure the network is overfitting.

There are several common ways to deal with overfitting:

  • Increase the amount of training data
  • Add dropout to the network (randomly switch neurons off during training)
  • Add regularization (weight decay); a sketch follows after the link below

You can read more about these remedies here:

http://neuralnetworksanddeeplearning.com/chap3.html#overfitting_and_regularization
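
For the weight-decay option above, a minimal TensorFlow 1.x sketch in the style of the question's code; l2_lambda and base_loss are illustrative names, not part of the original post:

# Add an L2 penalty on the largest weight matrices to the existing loss.
l2_lambda = 5e-4  # assumed coefficient; tune on a validation set
l2_loss = l2_lambda * (tf.nn.l2_loss(w_fc_1) + tf.nn.l2_loss(fc_layer) + tf.nn.l2_loss(w_out))
total_loss = base_loss + l2_loss  # base_loss is whatever loss the network already minimizes
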


Your network is probably overfitting. Try adding dropout (with a keep probability of about 0.5) to the fully connected layers to combat this.
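
The posted model already declares global keep_rate but never uses it. A sketch of wiring it into the two fully connected layers (feed a keep rate of roughly 0.5 during training and 1.0 at test time):

fc_1 = tf.nn.relu(tf.matmul(fc_helper, w_fc_1) + bias_fc_1)
fc_1 = tf.nn.dropout(fc_1, keep_rate)  # randomly zeroes units during training

fc = tf.nn.relu(tf.matmul(fc_1, fc_layer) + bias_fc)
fc = tf.nn.dropout(fc, keep_rate)
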

When you say accuracy, do you mean training accuracy or validation accuracy? If it is validation accuracy, then your network is most likely overtraining. Try the methods mentioned in the other answer. Also, reducing the learning rate over the epochs may help. - Gerry P
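
For the learning-rate suggestion in the comment, a sketch using tf.train.exponential_decay; the initial rate, the decay schedule, and the loss variable are assumptions:

global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(0.001, global_step,
                                           decay_steps=1000, decay_rate=0.95)
train_op = tf.train.AdamOptimizer(learning_rate).minimize(loss, global_step=global_step)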
