Computing the Matthews correlation coefficient with Keras

Question

I have a Keras model (Sequential) in Python 3:

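import keras

class LossHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs=None):
        # Start a fresh list of per-epoch MCC values for each training run.
        self.matthews_correlation = []

    def on_epoch_end(self, epoch, logs=None):
        # Keras passes the epoch index here; logs holds the reported metrics.
        logs = logs or {}
        self.matthews_correlation.append(logs.get('matthews_correlation'))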
...    
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['matthews_correlation'])
history = LossHistory()
model.fit(Xtrain, Ytrain, nb_epoch=10, batch_size=10, callbacks=[history])
scores = model.evaluate(Xtest, Ytest, verbose=1)

...
MCC = matthews_correlation(Ytest, predictions)

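The model.fit() method prints the training progress together with a Matthews correlation coefficient (MCC), presumably because of the metrics=['matthews_correlation'] argument. But these values differ considerably from the final MCC. The final MCC function gives the overall MCC of the predictions and agrees with sklearn's MCC function, so that is the value I trust.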

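1) What are the scores returned by model.evaluate()? They are completely different from both the final MCC and the per-epoch MCC.

2) What is the MCC reported for each epoch? It looks like this: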

Epoch 1/10 580/580 [===========] - 0s - loss: 0.2500 - matthews_correlation: -0.5817

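How are these values computed, and why do they differ so much from the final MCC?

3) Can I add the matthews_correlation() function to the on_epoch_train() function? Then I could compute and print the MCC independently. I do not know what Keras is doing implicitly.

Thanks for your help.

Edit: the code above is an example that records the loss history. If I print(history.matthews_correlation), I get a list containing the same MCC values that the progress report shows.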

- ste

1 Answer

- Matt07 · Accepted Answer

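Your MCC is probably negative because of a bug in the Keras implementation that was fixed recently. See this issue.

One way to solve the problem is to reinstall Keras from the GitHub master branch. Another is to define your own metric function (as described here) that includes the fix from the issue: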

import keras.backend as K
def matthews_correlation(y_true, y_pred):
    y_pred_pos = K.round(K.clip(y_pred, 0, 1))
    y_pred_neg = 1 - y_pred_pos

    y_pos = K.round(K.clip(y_true, 0, 1))
    y_neg = 1 - y_pos

    tp = K.sum(y_pos * y_pred_pos)
    tn = K.sum(y_neg * y_pred_neg)

    fp = K.sum(y_neg * y_pred_pos)
    fn = K.sum(y_pos * y_pred_neg)

    numerator = (tp * tn - fp * fn)
    denominator = K.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

    return numerator / (denominator + K.epsilon())
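
Even with the corrected function, the per-epoch number can still differ from an MCC computed over all predictions at once. The sketch below is plain Python, not Keras code, and it rests on an assumption: that the value printed during training is the metric averaged over batches, which is how function-style metrics were reported in the Keras versions I am aware of. The names `mcc` and `batch_averaged_mcc` and the toy labels are made up for illustration.

```python
import math

def mcc(y_true, y_pred, eps=1e-7):
    """MCC for binary 0/1 labels, using the same formula as the Keras snippet."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # eps plays the role of K.epsilon(): it avoids division by zero.
    return (tp * tn - fp * fn) / (denominator + eps)

def batch_averaged_mcc(y_true, y_pred, batch_size):
    """Size-weighted mean of per-batch MCC values (assumed Keras-style reporting)."""
    weighted_sum = 0.0
    for start in range(0, len(y_true), batch_size):
        t = y_true[start:start + batch_size]
        p = y_pred[start:start + batch_size]
        weighted_sum += mcc(t, p) * len(t)
    return weighted_sum / len(y_true)

# Toy data: perfect predictions, but each batch contains only one class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0]

overall = mcc(y_true, y_pred)
averaged = batch_averaged_mcc(y_true, y_pred, batch_size=4)

print(round(overall, 3))  # 1.0
print(averaged)           # 0.0
```

MCC is not a simple average over samples, so averaging it per batch generally does not reproduce the whole-set value, and small batches make single-class batches (which score 0 here) more likely. If this assumption holds for your Keras version, computing the MCC once over the full set of predictions, for example at the end of each epoch in a callback, should give numbers comparable to sklearn's.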