Accessing validation data within a custom callback

29

I am fitting a train_generator and want to compute custom metrics on my validation_generator from a custom callback. How can I access the parameters validation_steps and validation_data within a custom callback? They are not in self.params, and I cannot find them in self.model either. Here is what I would like to do; other approaches are welcome.

model.fit_generator(generator=train_generator,
                    steps_per_epoch=steps_per_epoch,
                    epochs=epochs,
                    validation_data=validation_generator,
                    validation_steps=validation_steps,
                    callbacks=[CustomMetrics()])


class CustomMetrics(keras.callbacks.Callback):

    def on_epoch_end(self, epoch, logs={}):
        for i in range(validation_steps):
            # features, labels = next(validation_data)
            # compute custom metric: f(features, labels)
            pass
        return

keras: 2.1.1

Update

I managed to pass the validation data to the custom callback's constructor. However, this results in an annoying "The kernel appears to have died. It will restart automatically." message. I doubt this is the right way to do it. Any suggestions?

import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

class CustomMetrics(keras.callbacks.Callback):

    def __init__(self, validation_generator, validation_steps):
        super(CustomMetrics, self).__init__()
        self.validation_generator = validation_generator
        self.validation_steps = validation_steps

    def on_epoch_end(self, epoch, logs={}):

        self.scores = {
            'recall_score': [],
            'precision_score': [],
            'f1_score': []
        }

        for batch_index in range(self.validation_steps):
            features, y_true = next(self.validation_generator)
            y_pred = np.asarray(self.model.predict(features))
            y_pred = y_pred.round().astype(int)
            self.scores['recall_score'].append(recall_score(y_true[:, 0], y_pred[:, 0]))
            self.scores['precision_score'].append(precision_score(y_true[:, 0], y_pred[:, 0]))
            self.scores['f1_score'].append(f1_score(y_true[:, 0], y_pred[:, 0]))
        return

metrics = CustomMetrics(validation_generator, validation_steps)

model.fit_generator(generator=train_generator,
                    steps_per_epoch=steps_per_epoch,
                    epochs=epochs,
                    validation_data=validation_generator,
                    validation_steps=validation_steps,
                    shuffle=True,
                    callbacks=[metrics],
                    verbose=1)
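
After fit_generator returns, the per-batch scores collected during the last epoch are still available on the callback instance and can be aggregated afterwards, e.g.:

# average the per-batch validation scores from the final epoch
print('val f1:', np.mean(metrics.scores['f1_score']))
print('val precision:', np.mean(metrics.scores['precision_score']))
print('val recall:', np.mean(metrics.scores['recall_score']))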

I don't think there is a good alternative. If you look at the _fit_loop code in Keras, you can see that it doesn't really pass validation_steps and validation_data to the callbacks. - sumitgouthaman
Would using next(validation_generator) at on_batch_begin be better than your approach? I mean, in that case I don't know whether next(val_generator) picks up the next iteration, or whether it always starts randomly from the beginning and never covers all of the validation data. - W. Sam
If you look at the Keras TensorBoard callback there seems to be a way of getting the validation data from the model, but I can't find where in the code it happens: https://github.com/tensorflow/tensorflow/blob/r1.14/tensorflow/python/keras/callbacks_v1.py - markemus
I provide a possible answer here: https://dev59.com/vlYN5IYBdhLWcg3wq5so#59697739 - bers
Hi everyone... thanks for the replies to this question. It has been a while and I can no longer reproduce the issue above today, but it's nice to see the lively discussion here. - w00dy
4 Answers

8
You can iterate directly over self.validation_data to aggregate all of the validation data at the end of each epoch. If you want to compute precision, recall and F1 over the complete validation dataset:
# Validation metrics callback: validation precision, recall and F1
# Some of the code was adapted from https://medium.com/@thongonary/how-to-compute-f1-score-for-each-epoch-in-keras-a1acd17715a2
import numpy as np
from keras import callbacks
from sklearn.metrics import f1_score, precision_score, recall_score

class Metrics(callbacks.Callback):

    def on_train_begin(self, logs={}):
        self.val_f1s = []
        self.val_recalls = []
        self.val_precisions = []

    def on_epoch_end(self, epoch, logs):
        # 5.4.1 For each validation batch
        for batch_index in range(0, len(self.validation_data)):
            # 5.4.1.1 Get the batch target values
            temp_targ = self.validation_data[batch_index][1]
            # 5.4.1.2 Get the batch prediction values
            temp_predict = (np.asarray(self.model.predict(
                                self.validation_data[batch_index][0]))).round()
            # 5.4.1.3 Append them to the corresponding output objects
            if(batch_index == 0):
                val_targ = temp_targ
                val_predict = temp_predict
            else:
                val_targ = np.vstack((val_targ, temp_targ))
                val_predict = np.vstack((val_predict, temp_predict))

        val_f1 = round(f1_score(val_targ, val_predict), 4)
        val_recall = round(recall_score(val_targ, val_predict), 4)
        val_precis = round(precision_score(val_targ, val_predict), 4)

        self.val_f1s.append(val_f1)
        self.val_recalls.append(val_recall)
        self.val_precisions.append(val_precis)

        # Add custom metrics to the logs, so that we can use them with
        # EarlyStop and csvLogger callbacks
        logs["val_f1"] = val_f1
        logs["val_recall"] = val_recall
        logs["val_precis"] = val_precis

        print("— val_f1: {} — val_precis: {} — val_recall {}".format(
                 val_f1, val_precis, val_recall))
        return

valid_metrics = Metrics()

You can then add valid_metrics to the callbacks argument:

your_model.fit_generator(..., callbacks = [valid_metrics])

Make sure you put it at the beginning of the callbacks list, in case you want other callbacks to use these measures.
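
For example, an EarlyStopping callback placed after valid_metrics can then monitor the logged val_f1 (a minimal sketch; the patience value here is only illustrative):

from keras.callbacks import EarlyStopping

# valid_metrics must run first so that val_f1 is already present in `logs`
# by the time EarlyStopping inspects it at the end of the epoch.
early_stop = EarlyStopping(monitor='val_f1', mode='max', patience=3)

your_model.fit_generator(..., callbacks=[valid_metrics, early_stop])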


4
Would it be possible to use the predictions already made on the validation data instead of recomputing them? - Eduardo Pignatelli
3
What is the prerequisite for accessing self.validation_data in def on_epoch_end(self, batch, logs)? I always get an "AttributeError: 'Metrics' object has no attribute 'validation_data'" error. - vanessaxenia
1
@vanessaxenia You need to pass validation_data as an argument to the Metrics class for the validation. - Timbus Calin
1
Your batch_index is actually a direct index into the data, so it yields only a single sample at a time; you would need to slice to get a full batch. More critically, self.validation_data is just a list of 4 elements, so this answer does not work at all. - information_interchange

1
While looking for a solution to the same problem, I found yours and another one in the accepted answer here. If that second solution works, it is better than iterating over all of the validation data again at "on epoch end".
The idea is to keep the target and prediction placeholders in variables and to update those variables from a custom callback at "on batch end".
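
For reference, a rough sketch of what that idea can look like with TF 1.x-era Keras is below. This is my own illustration of the commonly cited "fetches" trick rather than the exact code from the linked answer; the names CollectOutputAndTarget, var_y_true/var_y_pred and the undefined model, X and Y are placeholders, and model._function_kwargs is a Keras 2.x / TF 1.x internal that may not exist in newer versions.

import tensorflow as tf
import keras.backend as K
from keras.callbacks import Callback

class CollectOutputAndTarget(Callback):
    """Collect each batch's targets and predictions as training runs."""

    def __init__(self):
        super(CollectOutputAndTarget, self).__init__()
        self.targets = []   # y_true, one entry per batch
        self.outputs = []   # y_pred, one entry per batch
        # validate_shape=False so the (possibly smaller) last batch also fits
        self.var_y_true = tf.Variable(0., validate_shape=False)
        self.var_y_pred = tf.Variable(0., validate_shape=False)

    def on_batch_end(self, batch, logs=None):
        # read back the values written by the assign ops attached below
        self.targets.append(K.eval(self.var_y_true))
        self.outputs.append(K.eval(self.var_y_pred))

model.compile(optimizer='adam', loss='binary_crossentropy')

cbk = CollectOutputAndTarget()
# piggy-back assignments onto the training function so the variables are
# refreshed on every batch (relies on Keras 2.x / TF 1.x internals)
fetches = [tf.assign(cbk.var_y_true, model.targets[0], validate_shape=False),
           tf.assign(cbk.var_y_pred, model.outputs[0], validate_shape=False)]
model._function_kwargs = {'fetches': fetches}

model.fit(X, Y, callbacks=[cbk])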

1
Here is how:

import keras
from sklearn.metrics import r2_score

class MetricsCallback(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        if epoch:  # skip the very first epoch
            print(self.validation_data[0])
            x_test = self.validation_data[0]
            y_test = self.validation_data[1]
            predictions = self.model.predict(x_test)
            # r2_score expects (y_true, y_pred)
            print('r2:', round(r2_score(y_test, predictions), 2))

model.fit( ..., callbacks=[MetricsCallback()])

Reference:

Keras 2.2.4


3
As per your reference on GitHub, self.validation_data is None; this issue has not been resolved yet. - Vadym B.
3
The error appeared after switching from fit to flow_from_directory and fit_generator, which leaves self.validation_data empty. I was using fit. - B Seven

0

Verdant89 made a few mistakes and did not implement all of the functions. The code below should work.

import sys

import numpy as np
from keras import callbacks


class Metrics(callbacks.Callback):

    def on_train_begin(self, logs={}):
        self.val_f1s = []
        self.val_recalls = []
        self.val_precisions = []

    def on_epoch_end(self, epoch, logs):
        # 5.4.1 For each validation sample
        for batch_index in range(0, len(self.validation_data[0])):
            # 5.4.1.1 Get the sample's target values
            temp_target = self.validation_data[1][batch_index]
            # 5.4.1.2 Get the sample's prediction values
            temp_predict = (np.asarray(self.model.predict(np.expand_dims(
                                self.validation_data[0][batch_index], axis=0)))).round()
            # 5.4.1.3 Append them to the corresponding output objects
            if batch_index == 0:
                val_target = temp_target
                val_predict = temp_predict
            else:
                val_target = np.vstack((val_target, temp_target))
                val_predict = np.vstack((val_predict, temp_predict))

        tp, tn, fp, fn = self.compute_tptnfpfn(val_target, val_predict)
        val_f1 = round(self.compute_f1(tp, tn, fp, fn), 4)
        val_recall = round(self.compute_recall(tp, tn, fp, fn), 4)
        val_precis = round(self.compute_precision(tp, tn, fp, fn), 4)

        self.val_f1s.append(val_f1)
        self.val_recalls.append(val_recall)
        self.val_precisions.append(val_precis)

        # Add custom metrics to the logs, so that we can use them with
        # EarlyStop and csvLogger callbacks
        logs["val_f1"] = val_f1
        logs["val_recall"] = val_recall
        logs["val_precis"] = val_precis

        print("— val_f1: {} — val_precis: {} — val_recall {}".format(
                 val_f1, val_precis, val_recall))
        return

    def compute_tptnfpfn(self, val_target, val_predict):
        # cast to boolean
        val_target = val_target.astype('bool')
        val_predict = val_predict.astype('bool')

        tp = np.count_nonzero(val_target * val_predict)
        tn = np.count_nonzero(~val_target * ~val_predict)
        fp = np.count_nonzero(~val_target * val_predict)
        fn = np.count_nonzero(val_target * ~val_predict)

        return tp, tn, fp, fn

    def compute_f1(self, tp, tn, fp, fn):
        f1 = tp * 1. / (tp + 0.5 * (fp + fn) + sys.float_info.epsilon)
        return f1

    def compute_recall(self, tp, tn, fp, fn):
        recall = tp * 1. / (tp + fn + sys.float_info.epsilon)
        return recall

    def compute_precision(self, tp, tn, fp, fn):
        precision = tp * 1. / (tp + fp + sys.float_info.epsilon)
        return precision
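
As the comments on the previous answer suggest, this relies on Keras populating self.validation_data, which typically only happens when validation arrays are passed directly to fit (not fit_generator). A minimal usage sketch under that assumption, where x_train, y_train, x_val and y_val are hypothetical numpy arrays:

valid_metrics = Metrics()

# self.validation_data is filled in by Keras only when the validation set is
# handed to fit() as in-memory arrays; with fit_generator it is often None.
model.fit(x_train, y_train,
          validation_data=(x_val, y_val),
          epochs=10,
          callbacks=[valid_metrics])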
