Unbalanced data and weighted cross entropy

Question

python machine-learning tensorflow deep-learning

71

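I'm trying to train a network with unbalanced data. I have A (198 samples), B (436 samples), C (710 samples), D (272 samples), and I have read about "weighted_cross_entropy_with_logits", but all the examples I found are for binary classification, so I'm not very confident about how to set those weights.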

Total samples: 1616

A_weight: 198/1616 = 0.12?

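If I understood correctly, the idea behind it is to penalize the errors of the majority class and to value hits in the minority class more positively, right?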

My piece of code:

weights = tf.constant([0.12, 0.26, 0.43, 0.17])
cost = tf.reduce_mean(tf.nn.weighted_cross_entropy_with_logits(logits=pred, targets=y, pos_weight=weights))

I have read this post and other examples with binary classification, but it is still not very clear to me.

- Sergiodiaz53

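I would like to understand why weighted cross entropy exists, what it does, and what the idea behind it really is. Which resources have you read? I couldn't find many. - haneulkim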

4 Answers

4

See this answer for an alternative solution that works with sparse_softmax_cross_entropy:

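# TensorFlow 1.x API: tf.InteractiveSession and tf.losses.sparse_softmax_cross_entropy
# are not available under these names in TensorFlow 2.x
import tensorflow as tf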
import numpy as np

np.random.seed(123)
sess = tf.InteractiveSession()

# let's say we have the logits and labels of a batch of size 6 with 5 classes
logits = tf.constant(np.random.randint(0, 10, 30).reshape(6, 5), dtype=tf.float32)
labels = tf.constant(np.random.randint(0, 5, 6), dtype=tf.int32)

# specify some class weightings
class_weights = tf.constant([0.3, 0.1, 0.2, 0.3, 0.1])

# specify the weights for each sample in the batch (without having to compute the onehot label matrix)
weights = tf.gather(class_weights, labels)

# compute the loss
tf.losses.sparse_softmax_cross_entropy(labels, logits, weights).eval()
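
As a rough illustration of why the tf.gather step works, here is a minimal pure-Python sketch (no TensorFlow). The helpers gather and one_hot are hypothetical stand-ins written for this example, and the label values are made up; the point is only that looking up the class weight by integer label gives the same per-sample weights as multiplying by a one-hot matrix and summing over classes.

```python
def gather(values, indices):
    """Pick values[i] for each index, like a 1-D gather (illustrative helper)."""
    return [values[i] for i in indices]


def one_hot(labels, num_classes):
    """Build a dense one-hot matrix from integer labels (illustrative helper)."""
    return [[1.0 if c == y else 0.0 for c in range(num_classes)] for y in labels]


class_weights = [0.3, 0.1, 0.2, 0.3, 0.1]
labels = [2, 0, 4, 4, 1, 3]  # made-up sparse labels for a batch of 6

# Sparse route: one lookup per sample.
via_gather = gather(class_weights, labels)

# Dense route: multiply by the one-hot row and sum over the class axis.
via_onehot = [
    sum(w * h for w, h in zip(class_weights, row))
    for row in one_hot(labels, len(class_weights))
]
```

Both routes yield one weight per sample; the sparse route simply skips building the (batch, num_classes) matrix.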

- DankMasterDan

1

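I upvoted this answer because it was at -1. I think it deserves at least 0, since it led me to discover tf.gather, which made my code very efficient because I have sparse labels rather than dense ones. - AneesAhmed777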

1

@DankMasterDan: Links are fine for credit and context, but please copy and paste the referenced code into your answer so that it is self-contained. - AneesAhmed777

4

TensorFlow 2.0 compatible answer: for the benefit of the community, this migrates the code given in P-Gn's answer to 2.0.

# your class weights
class_weights = tf.compat.v2.constant([[1.0, 2.0, 3.0]])
# deduce weights for batch samples based on their true label
weights = tf.compat.v2.reduce_sum(class_weights * onehot_labels, axis=1)
# compute your (unweighted) softmax cross entropy loss
unweighted_losses = tf.compat.v2.nn.softmax_cross_entropy_with_logits(onehot_labels, logits)
# apply the weights, relying on broadcasting of the multiplication
weighted_losses = unweighted_losses * weights
# reduce the result to get your final loss
loss = tf.reduce_mean(weighted_losses)

For more information on migrating code from TensorFlow 1.x to 2.x, see the Migration Guide.

- user11530462

0

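You can actually keep the categorical cross-entropy loss and train with the class_weight parameter (an argument of Keras Model.fit). Its description says: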

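Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class. When class_weight is specified and targets have a rank of 2 or greater, either y must be one-hot encoded, or an explicit final dimension of 1 must be included for sparse class labels.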

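I used total_samples / (2 * class_occurences) and it worked. That is the inverse class frequency divided by 2, so rarer classes get larger weights; the list in the question holds the raw frequencies (count / total) instead, which would weight the majority class most. Other scalings of the inverse frequencies should work too; just check which values work best for you.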
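
To make the arithmetic concrete, here is a small sketch that applies this formula to the class counts from the question. It assumes the 2 in the formula stands for the number of classes in a binary problem, so the sketch uses the number of classes (4 here) by default; the function name balanced_class_weight and its divisor argument are made up for this example.

```python
# Class counts from the question, keyed by class index (A=0, B=1, C=2, D=3).
counts = {0: 198, 1: 436, 2: 710, 3: 272}


def balanced_class_weight(counts, divisor=None):
    """Return {class_index: total / (divisor * count)}.

    divisor defaults to the number of classes (an assumption, see above).
    """
    total = sum(counts.values())
    k = len(counts) if divisor is None else divisor
    return {cls: total / (k * n) for cls, n in counts.items()}


class_weight = balanced_class_weight(counts)
# Rarer classes get larger weights: A is about 2.04, C is about 0.57.
```

A dictionary of this shape is what the class_weight argument described above expects. With the number of classes as divisor, the average weight over all training samples works out to 1, so the overall loss scale stays roughly unchanged.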

There is also a good TF tutorial on handling imbalanced data.

- J Agustin Barrachina

- P-Gn · Accepted Answer

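Note that weighted_cross_entropy_with_logits is the weighted variant of sigmoid_cross_entropy_with_logits. Sigmoid cross entropy is typically used for binary classification. Yes, it can handle multiple labels, but sigmoid cross entropy basically makes a (binary) decision on each of them. For a face recognition net, for example, those (not mutually exclusive) labels could be "Does the subject wear glasses?", "Is the subject female?", etc.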

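In binary classification, each output channel corresponds to a binary (soft) decision. Therefore, the weighting needs to happen within the computation of the loss. This is what weighted_cross_entropy_with_logits does, by weighting one term of the cross entropy over the other.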
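
For reference, a minimal pure-Python sketch of that weighted loss for a single output channel, following the formula as I understand it from the TensorFlow documentation: pos_weight * target * -log(sigmoid(x)) + (1 - target) * -log(1 - sigmoid(x)). The function names are made up for this example, and it is not TensorFlow's actual implementation.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_sigmoid_xent(logit, target, pos_weight):
    """Weighted sigmoid cross entropy for one logit and one 0/1 target.

    Uses -log(sigmoid(x)) == softplus(-x) and -log(1 - sigmoid(x)) == softplus(x).
    Only the positive-target term is scaled by pos_weight.
    """
    return pos_weight * target * softplus(-logit) + (1.0 - target) * softplus(logit)


# With pos_weight=3, a missed positive costs three times a missed negative.
miss_positive = weighted_sigmoid_xent(0.0, 1.0, pos_weight=3.0)
miss_negative = weighted_sigmoid_xent(0.0, 0.0, pos_weight=3.0)
```

Because pos_weight only rescales the positive term of each independent binary decision, it trades off recall against precision per channel rather than weighting mutually exclusive classes against each other.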

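In mutually exclusive multi-class classification, we use softmax_cross_entropy_with_logits, which behaves differently: each output channel corresponds to the score of a class candidate. The decision comes afterwards, by comparing the respective outputs of each channel.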

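Weighting before the final decision is therefore a simple matter of modifying the scores before comparing them, typically by multiplying them by weights. For example, for a ternary classification task,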

# your class weights
class_weights = tf.constant([[1.0, 2.0, 3.0]])
# deduce weights for batch samples based on their true label
weights = tf.reduce_sum(class_weights * onehot_labels, axis=1)
# compute your (unweighted) softmax cross entropy loss
# (keyword arguments: TensorFlow 1.x rejects positional labels/logits here)
unweighted_losses = tf.nn.softmax_cross_entropy_with_logits(labels=onehot_labels, logits=logits)
# apply the weights, relying on broadcasting of the multiplication
weighted_losses = unweighted_losses * weights
# reduce the result to get your final loss
loss = tf.reduce_mean(weighted_losses)

You could also rely on tf.losses.softmax_cross_entropy to handle the last three steps.
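
To check the steps by hand, here is a pure-Python sketch of the same computation on a tiny made-up batch. It mirrors the snippet above step by step but does not use TensorFlow; the helper softmax_xent and the example values are assumptions made for illustration.

```python
import math


def softmax_xent(onehot, logits):
    """Softmax cross entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return -sum(t * (v - log_z) for t, v in zip(onehot, logits))


class_weights = [1.0, 2.0, 3.0]
onehot_labels = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # true classes 0 and 2
logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]         # uniform predictions

# Weight of each sample = weight of its true class.
weights = [sum(w * t for w, t in zip(class_weights, row)) for row in onehot_labels]
# Unweighted per-sample losses (log(3) each for uniform predictions).
unweighted_losses = [softmax_xent(t, x) for t, x in zip(onehot_labels, logits)]
# Scale each sample's loss by its weight, then average over the batch.
weighted_losses = [w * l for w, l in zip(weights, unweighted_losses)]
loss = sum(weighted_losses) / len(weighted_losses)
```

Here the sample from class 2 contributes three times as much to the loss as the sample from class 0, even though both predictions are equally wrong.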

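In your case, where you need to tackle data imbalance, the class weights could indeed be inversely proportional to their frequency in your training data. Normalizing them so that they sum to one or to the number of classes also makes sense.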
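
As a worked example of both normalizations, here is a short sketch using the class counts from the question. The variable names are made up for this example; which normalization works better in practice is something to verify on your own data.

```python
# Class counts from the question: A, B, C, D.
counts = [198, 436, 710, 272]

# Inverse-frequency weights (the common factor 1/total cancels on normalizing).
inverse = [1.0 / n for n in counts]

# Option 1: normalize so the weights sum to one.
weights_sum_to_one = [v / sum(inverse) for v in inverse]

# Option 2: normalize so the weights sum to the number of classes,
# which keeps a "typical" weight near 1.
num_classes = len(counts)
weights_sum_to_k = [num_classes * w for w in weights_sum_to_one]
```

Both options keep the same ratios between classes; they differ only in the overall scale of the loss, which interacts with the learning rate.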

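Note that in the above, we penalized the loss based on the true label of the samples. We could also have penalized the loss based on the estimated labels by simply defining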

weights = class_weights

and the rest of the code need not change, thanks to broadcasting magic.

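In the general case, you would want weights that depend on the kind of error you make. In other words, for each pair of labels X and Y, you could choose how to penalize choosing label X when the true label is Y. You would end up with a whole prior weight matrix, which results in weights above being a full (num_samples, num_classes) tensor. This goes a bit beyond what you asked for, but it may still be useful to know that only the definition of the weight tensor needs to change in the code above.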