Difference between accuracy_score in scikit-learn and accuracy in Keras

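I have implemented and trained a multi-class convolutional neural network in Keras. The test accuracy is 0.9522. However, when I compute the accuracy with accuracy_score from scikit-learn, I get 0.6224. Here is what I did: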

X_train = X[:60000, :, :, :]
X_test = X[60000:, :, :, :]
y_train = y[:60000, :]
y_test = y[60000:, :]
print ('Size of the arrays:')
print ('X_train: ' + str(X_train.shape))
print ('X_test: ' + str(X_test.shape))
print ('y_train: ' + str(y_train.shape))
print ('y_test: ' + str(y_test.shape))

Output:

Size of the arrays:
X_train: (60000, 64, 64, 3)
X_test: (40000, 64, 64, 3)
y_train: (60000, 14)
y_test: (40000, 14)

Fitting the Keras model (to keep the code short, I am not including the whole model here):
model = Sequential()
model.add(Conv2D(10, (5,5), padding='same', input_shape=(64, 64, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(14))
model.add(Activation('softmax'))
model.compile(optimizer='rmsprop', loss='mean_squared_error', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=100, epochs=5, verbose=1, validation_data=(X_test, y_test))

Accuracy with scikit-learn:

y_pred = model.predict(X_test, batch_size=100)
y_pred1D = y_pred.argmax(1)
y_pred = model.predict(X_test, batch_size=100)
y_test1D = y_test.argmax(1)
print ('Accuracy on validation data: ' + str(accuracy_score(y_test1D, y_pred1D)))

Score:

Accuracy on validation data: 0.6224

Accuracy with Keras:

score_Keras = model.evaluate(X_test, y_test, batch_size=200)
print('Accuracy on validation data with Keras: ' + str(score_Keras[1]))

Result:

Accuracy on validation data with Keras: 0.95219109267

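My question is: why do these two accuracies differ, and which one should I use to evaluate the performance of my multi-class classifier?
Thanks in advance!

1 Answer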

There is a typo in your code: why do you define y_pred twice?

y_pred = model.predict(X_test, batch_size=100)
y_pred1D = y_pred.argmax(1)
y_pred = model.predict(X_test, batch_size=100)
y_test1D = y_test.argmax(1)
print ('Accuracy on validation data: ' + str(accuracy_score(y_test1D, y_pred1D)))

It should be:

y_pred = model.predict(X_test, batch_size=100)
y_pred1D = y_pred.argmax(1)
y_test1D = y_test.argmax(1)
print ('Accuracy on validation data: ' + str(accuracy_score(y_test1D, y_pred1D)))

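Even so, you should post the values and shapes of y_pred1D and y_test1D. The error is in the step where you compute y_pred1D = y_pred.argmax(1) and y_test1D = y_test.argmax(1) for use with the scikit-learn metric. My guess is that these arrays are not what you think they are; otherwise the two metrics would be the same.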
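
To make the "values and shapes" check concrete, here is a minimal sketch using NumPy only. The arrays y_test_demo and y_prob_demo are hypothetical stand-ins for y_test and model.predict(X_test), and argmax_accuracy is a helper name made up for this illustration. It assumes the labels are one-hot encoded. Under that assumption, the fraction of matching class indices computed here is the same quantity that accuracy_score returns by default for 1-D class-index arrays.

```python
import numpy as np

def argmax_accuracy(y_true_onehot, y_prob):
    """Fraction of rows whose predicted class index equals the true class index."""
    y_true_1d = np.asarray(y_true_onehot).argmax(axis=1)
    y_pred_1d = np.asarray(y_prob).argmax(axis=1)
    # Both arrays must be 1-D with one class index per sample.
    assert y_true_1d.shape == y_pred_1d.shape
    return float(np.mean(y_true_1d == y_pred_1d))

# Hypothetical data: 4 samples, 3 classes (not the asker's real arrays).
y_test_demo = np.array([[1, 0, 0],
                        [0, 1, 0],
                        [0, 0, 1],
                        [0, 1, 0]])
y_prob_demo = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.8, 0.1],
                        [0.3, 0.4, 0.3],
                        [0.2, 0.5, 0.3]])

y_test1D = y_test_demo.argmax(1)
y_pred1D = y_prob_demo.argmax(1)

# Inspect shapes and values before passing them to any metric.
print(y_test1D.shape, y_test1D)
print(y_pred1D.shape, y_pred1D)

demo_accuracy = argmax_accuracy(y_test_demo, y_prob_demo)
print(demo_accuracy)
```

If the printed shapes differ, or the values do not look like class indices in the expected range, the mismatch between the two accuracy figures most likely comes from how these arrays were built and not from the metric itself.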

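I fixed the typo, and now both methods give the same accuracy. I don't understand how it changed the computation, but I will mark your answer as correct. Thank you very much. - Antoine Caté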
