Keras: weighted binary cross-entropy

Question

39

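I am trying to implement a weighted binary cross-entropy with Keras, but I am not sure the code is correct. The training output looks a bit confusing: after a few epochs, my accuracy is only about 0.15. I think that is far too low (even for random guessing).

In general, about 11% of the outputs are ones and 89% are zeros, so the weights are w_zero=0.89 and w_one=0.11.

My code: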

from keras import backend as K  # Keras backend, referred to as K below

def create_weighted_binary_crossentropy(zero_weight, one_weight):

    def weighted_binary_crossentropy(y_true, y_pred):

        # Original binary crossentropy (see losses.py):
        # K.mean(K.binary_crossentropy(y_true, y_pred), axis=-1)

        # Calculate the binary crossentropy
        b_ce = K.binary_crossentropy(y_true, y_pred)

        # Apply the weights
        weight_vector = y_true * one_weight + (1. - y_true) * zero_weight
        weighted_b_ce = weight_vector * b_ce

        # Return the mean error
        return K.mean(weighted_b_ce)

    return weighted_binary_crossentropy

Can anyone spot what is wrong?

Thanks

- Kevin Meier

3

Why not just use the class_weight parameter of model.fit? - Dr. Snoopy

1

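class_weight does not seem to be applied to the validation data. That makes comparing the validation loss with the training loss less reliable. - Kevin Meier

Any update? Did the selected answer solve your problem? - SantoshGupta7

7 Answers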

24

You can use the sklearn module to compute the weight for each class automatically, like this:

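# Imports
import numpy as np
from sklearn.utils import class_weight
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Example model
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))

# Use binary crossentropy loss
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Calculate the weights for each class so that we can balance the data
# (recent scikit-learn versions require keyword arguments here)
classes = np.unique(y_train)
weights = class_weight.compute_class_weight(class_weight='balanced',
                                            classes=classes,
                                            y=y_train)

# Keras expects a dict mapping class index to weight, not an array
class_weights = dict(zip(classes.tolist(), weights.tolist()))

# Add the class weights to the training
model.fit(x_train, y_train, epochs=10, batch_size=32, class_weight=class_weights)

Note that the output of class_weight.compute_class_weight() is a NumPy array like this: [2.57569845 0.68250928]. That is why it is converted to a dict before being passed to model.fit.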
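
For reference, scikit-learn documents the 'balanced' heuristic as n_samples / (n_classes * np.bincount(y)). Below is a minimal NumPy sketch of that formula, using a made-up label vector with the 11% / 89% split from the question. The helper name balanced_class_weights is hypothetical and not part of any library.

```python
import numpy as np


def balanced_class_weights(y):
    """Return {class_label: weight} using n_samples / (n_classes * count)."""
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}


# Hypothetical labels: 11 ones and 89 zeros, as in the question.
y_example = np.array([1] * 11 + [0] * 89)
class_weights = balanced_class_weights(y_example)
print(class_weights)
```

The rarer class (1) ends up with the larger weight, and the resulting dict has the shape that the class_weight argument of model.fit expects.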

- tsveti_iko

This is a nice alternative. But how does it handle the validation set? Sklearn only works from the training data. - Sarvagya Gupta

5

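Using class_weights in model.fit is slightly different: it actually reweights the samples rather than computing a weighted loss.

I also found that class_weights and sample_weights are ignored in TF 2.0.0 when x is passed to model.fit as a TFDataset or a generator. I believe this has been fixed in TF 2.1.0+.

Here is my weighted binary cross-entropy function for multi-hot encoded labels.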

import tensorflow as tf
import tensorflow.keras.backend as K
import numpy as np
# weighted loss functions


def weighted_binary_cross_entropy(weights: dict, from_logits: bool = False):
    '''
    Return a function for calculating weighted binary cross entropy
    It should be used for multi-hot encoded labels

    # Example
    y_true = tf.convert_to_tensor([1, 0, 0, 0, 0, 0], dtype=tf.int64)
    y_pred = tf.convert_to_tensor([0.6, 0.1, 0.1, 0.9, 0.1, 0.], dtype=tf.float32)
    weights = {
        0: 1.,
        1: 2.
    }
    # with weights
    loss_fn = weighted_binary_cross_entropy(weights=weights, from_logits=False)
    loss = loss_fn(y_true, y_pred)
    print(loss)
    # tf.Tensor(0.6067193, shape=(), dtype=float32)

    # without weights (equal weights reproduce the plain binary cross entropy)
    loss_fn = weighted_binary_cross_entropy(weights={0: 1., 1: 1.})
    loss = loss_fn(y_true, y_pred)
    print(loss)
    # tf.Tensor(0.52158177, shape=(), dtype=float32)

    # Another example
    y_true = tf.convert_to_tensor([[0., 1.], [0., 0.]], dtype=tf.float32)
    y_pred = tf.convert_to_tensor([[0.6, 0.4], [0.4, 0.6]], dtype=tf.float32)
    weights = {
        0: 1.,
        1: 2.
    }
    # with weights
    loss_fn = weighted_binary_cross_entropy(weights=weights, from_logits=False)
    loss = loss_fn(y_true, y_pred)
    print(loss)
    # tf.Tensor(1.0439969, shape=(), dtype=float32)

    # without weights (equal weights reproduce the plain binary cross entropy)
    loss_fn = weighted_binary_cross_entropy(weights={0: 1., 1: 1.})
    loss = loss_fn(y_true, y_pred)
    print(loss)
    # tf.Tensor(0.81492424, shape=(), dtype=float32)

    @param weights A dict setting weights for 0 and 1 label. e.g.
        {
            0: 1.
            1: 8.
        }
        For this case, we want to emphasise the true (1) labels,
        because we have many false (0) labels. e.g.
            [
                [0 1 0 0 0 0 0 0 0 1]
                [0 0 0 0 1 0 0 0 0 0]
                [0 0 0 0 1 0 0 0 0 0]
            ]

        

    @param from_logits If True, y_pred is treated as logits (a sigmoid is
        applied internally); if False, y_pred is treated as probabilities
    @return A function to calculate (weighted) binary cross entropy
    '''
    assert 0 in weights
    assert 1 in weights

    def weighted_cross_entropy_fn(y_true, y_pred):
        tf_y_true = tf.cast(y_true, dtype=y_pred.dtype)
        tf_y_pred = tf.cast(y_pred, dtype=y_pred.dtype)

        weights_v = tf.where(tf.equal(tf_y_true, 1), weights[1], weights[0])
        weights_v = tf.cast(weights_v, dtype=y_pred.dtype)
        ce = K.binary_crossentropy(tf_y_true, tf_y_pred, from_logits=from_logits)
        loss = K.mean(tf.multiply(ce, weights_v))
        return loss

    return weighted_cross_entropy_fn

- menrfa

Of all the answers, this is the most comprehensive one. One suggestion: for weights_v we should use weights_v = tf.cast(tf.where(tf.equal(tf_y_true, 1), weights[1], weights[0]), dtype=y_pred.dtype) to avoid a dtype mismatch in the multiplication, for example:

InvalidArgumentError: cannot compute Mul as input #1(zero-based) was expected to be a float tensor but is a double tensor [Op:Mul]

(seen with TF 2.8.4) - Anugraha Sinha

3

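I think using class weight in model.fit is not correct. The 0 here (as in class_weight={0: 0.11, 1: 0.89}) is an index, not class 0. Keras documentation: https://keras.io/models/sequential/ class_weight: optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function during training. This can be used to tell the model to "pay more attention" to samples from an under-represented class, which can improve predictions for that class.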

- Cheng Yang

1

Hmm, this seems to suggest that the API offers no way to weight classes in a binary classification problem. Is that right? - sid-kap

2

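@sid-kap, I have exactly the same question: does the API offer any way to provide class_weight for binary classification? Did you find an answer? - Naman

How do we use class weights for binary segmentation, where we use binary cross-entropy and our labels (masks) contain float values (1.0 and 0.0)? Can we use float values as labels, or do labels only represent indices? - anilsathyan7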

3

You can compute the weights as follows and use them with binary cross-entropy. This is intended to set one_weight to 0.89 and zero_weight to 0.11 programmatically:

one_weight = (1-num_of_ones)/(num_of_ones + num_of_zeros)
zero_weight = (1-num_of_zeros)/(num_of_ones + num_of_zeros)

def weighted_binary_crossentropy(zero_weight, one_weight):

    def weighted_binary_crossentropy(y_true, y_pred):

        b_ce = K.binary_crossentropy(y_true, y_pred)

        # weighted calc
        weight_vector = y_true * one_weight + (1 - y_true) * zero_weight
        weighted_b_ce = weight_vector * b_ce

        return K.mean(weighted_b_ce)

    return weighted_binary_crossentropy

- Sayan Dey

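This is useful, because "class_weights" does not work in the image classification case. There is an error in the definition of the weights (they come out negative because of the parentheses!). - elbe

I suspect this is not correct -> (1-num_of_ones)/(num_of_ones + num_of_zeros). It should be like this -> 1 - num_of_ones/(num_of_ones + num_of_zeros), and the same applies to zero_weight. - Anugraha Sinha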
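
Following the two comments above, here is a small sketch of the weight computation with the parentheses placed so that each weight is one minus that class's frequency. The function name class_frequency_weights is hypothetical, and the counts 11 and 89 are simply the proportions from the question.

```python
def class_frequency_weights(num_of_ones, num_of_zeros):
    """Weight each class by the frequency of the other class."""
    total = num_of_ones + num_of_zeros
    one_weight = 1.0 - num_of_ones / total    # equals num_of_zeros / total
    zero_weight = 1.0 - num_of_zeros / total  # equals num_of_ones / total
    return zero_weight, one_weight


# Assumed counts: 11 ones and 89 zeros out of 100 samples.
zero_weight, one_weight = class_frequency_weights(num_of_ones=11, num_of_zeros=89)
print(zero_weight, one_weight)
```

With these counts both weights lie between 0 and 1, and the rarer class (ones) receives the larger weight.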

2

For me, the best approach is this:

import tensorflow as tf
from tensorflow.keras import backend as K

def custom_weighted_binary_crossentropy(zero_weight, one_weight):

    def weighted_binary_crossentropy(y_true, y_pred):
        y_true = K.cast(y_true, dtype=tf.float32)

        epsilon = tf.keras.backend.epsilon()
        y_pred = tf.clip_by_value(y_pred, epsilon, 1. - epsilon)

        # Compute cross entropy from probabilities.
        bce = y_true * tf.math.log(y_pred + epsilon)
        bce += (1 - y_true) * tf.math.log(1 - y_pred + epsilon)
        bce = -bce

        # Apply the weights to each class individually
        weight_vector = y_true * one_weight + (1. - y_true) * zero_weight
        weighted_bce = weight_vector * bce

        # Return the mean error
        return tf.reduce_mean(weighted_bce)

    return weighted_binary_crossentropy

- Pablo Pérez-Núñez

0

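If you need a weighted validation loss whose weights differ from those of the training loss, you can use the validation_data parameter of tf.keras.Model.fit() and pass your validation set as a tuple of NumPy arrays containing the validation data, the labels, and a weight for each sample.

Note that with this technique you have to map each sample to its weight yourself (here, by class).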

See the TensorFlow documentation: https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit
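
The per-sample mapping could look like the sketch below. The helper per_sample_weights, the arrays x_val and y_val, and the weights 0.11 / 0.89 are all illustrative assumptions. The model.fit call is left as a comment because it needs a compiled model and training data.

```python
import numpy as np


def per_sample_weights(y, zero_weight, one_weight):
    """Map each binary label to the weight of its class."""
    y = np.asarray(y)
    return np.where(y == 1, one_weight, zero_weight).astype("float32")


# Hypothetical validation set: 5 samples with 3 features each.
x_val = np.zeros((5, 3), dtype="float32")
y_val = np.array([0, 1, 0, 0, 1])

val_weights = per_sample_weights(y_val, zero_weight=0.11, one_weight=0.89)
validation_data = (x_val, y_val, val_weights)

# Assuming a compiled model and training arrays exist:
# model.fit(x_train, y_train, validation_data=validation_data, epochs=10)
```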

- Tina

- Yu-Yang · Accepted Answer

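Normally, the minority class gets the higher weight. It would be better to use one_weight=0.89, zero_weight=0.11 (by the way, you can also use class_weight={0: 0.11, 1: 0.89}, as suggested in the comments).

Under class imbalance, your model sees many more zeros than ones. It will also learn to predict more zeros than ones, because the training loss can be minimized that way. That is also why you are seeing an accuracy close to the proportion 0.11. If you average the model's predictions, the result should be very close to zero.

The purpose of class weights is to change the loss function so that the training loss cannot be minimized by the "easy solution" (i.e., predicting zeros). That is why it is better to use a higher weight for the ones.

Note that the best weights are not necessarily 0.89 and 0.11. Sometimes you may have to try things like taking logarithms or square roots (or any weights satisfying one_weight > zero_weight) to make it work.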
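
To illustrate the "easy solution" numerically, the NumPy sketch below finds the constant prediction that minimizes a weighted binary cross-entropy on labels with 11% ones. A constant prediction is only a stand-in for a model that ignores its inputs. The function names and the grid of candidate values are assumptions made for this example.

```python
import numpy as np


def weighted_bce(y_true, y_pred, zero_weight, one_weight, eps=1e-7):
    """Mean weighted binary cross-entropy computed from probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    bce = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    weights = y_true * one_weight + (1.0 - y_true) * zero_weight
    return float(np.mean(weights * bce))


def best_constant_prediction(y_true, zero_weight, one_weight):
    """Grid-search the constant probability with the lowest weighted loss."""
    candidates = np.linspace(0.01, 0.99, 99)
    losses = [
        weighted_bce(y_true, np.full_like(y_true, p), zero_weight, one_weight)
        for p in candidates
    ]
    return float(candidates[int(np.argmin(losses))])


# Hypothetical labels: 11 ones and 89 zeros.
y = np.array([1.0] * 11 + [0.0] * 89)

# Weights from the question: the majority class has the larger weight.
p_question = best_constant_prediction(y, zero_weight=0.89, one_weight=0.11)
# Weights from this answer: the minority class has the larger weight.
p_answer = best_constant_prediction(y, zero_weight=0.11, one_weight=0.89)
print(p_question, p_answer)
```

With the question's weights the best constant prediction sits near zero, so always predicting "0" is rewarded even more than without weights. With the swapped weights it moves to about 0.5, so that shortcut no longer minimizes the loss.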
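
One possible reading of "taking logarithms or square roots" is to dampen the inverse-frequency ratio, as in the sketch below. It assumes zero_weight is fixed at 1.0 and only one_weight varies. The function name candidate_one_weights and the use of log1p are illustrative choices, not something the answer prescribes.

```python
import numpy as np


def candidate_one_weights(num_of_ones, num_of_zeros):
    """Return a few candidate values for one_weight (zero_weight = 1.0)."""
    ratio = num_of_zeros / num_of_ones  # how many zeros there are per one
    return {
        "inverse_frequency": float(ratio),
        "sqrt": float(np.sqrt(ratio)),
        "log": float(np.log1p(ratio)),  # log(1 + ratio) stays positive
    }


# Assumed counts: 11 ones and 89 zeros.
candidates = candidate_one_weights(num_of_ones=11, num_of_zeros=89)
for name, weight in candidates.items():
    print(name, round(weight, 3))
```

All three candidates keep one_weight > zero_weight, and the damped variants penalize missed ones less aggressively than the raw ratio. Which one works best is an empirical question for the data at hand.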