Caffe: Softmax with temperature

Question

neural-network deep-learning caffe conv-neural-network softmax

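I am implementing Hinton's knowledge distillation paper. The first step is to obtain the soft targets of the "cumbersome model" at a higher temperature (i.e. run a forward pass per image and store the soft targets computed with temperature T).

Is there a way to get the soft targets of AlexNet or GoogLeNet, but with a different temperature?

I need to change the softmax function to pi = exp(zi/T) / sum_j exp(zj/T).

In other words, the output of the final fully connected layer has to be divided by the temperature T. I only need this for the forward pass (not for training).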

- Sid M

@Shai I have posted a link to Hinton's paper. - Sid M

1 Answer

- Shai · Accepted Answer

I think there are a few options for solving this problem.

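1. Implement a custom Softmax layer with a temperature parameter. It is straightforward to modify the code of softmax_layer.cpp to take the temperature T into account. You may also need to tweak caffe.proto so that a Softmax layer with the extra parameter can be parsed.
2. Implement the layer as a Python layer.
3. If you only need the forward pass, i.e. "extracting features", you can simply output the "top" of the layer before the softmax layer as features and apply the softmax with temperature outside Caffe.
4. Add a Scale layer before the top Softmax layer: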

layer {
  type: "Scale"
  name: "temperature"
  bottom: "zi"
  top: "zi/T"
  scale_param { 
    filler: { type: 'constant' value: 1/T }  # replace "1/T" with the actual value of 1/T.
  }
  param { lr_mult: 0 decay_mult: 0 } # make sure temperature is fixed
}
layer {
  type: "Softmax"
  name: "prob"
  bottom: "zi/T"
  top: "pi"
}
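
For option 3, the softmax with temperature can be computed outside Caffe once the pre-softmax scores have been extracted. Below is a minimal NumPy sketch, not part of the original answer. The `logits` array and the value `T = 4.0` are made-up illustrations; in practice the scores would come from whichever blob feeds the Softmax layer in your model. The sketch also checks that multiplying by the constant 1/T and then applying an ordinary softmax, as the Scale + Softmax pair does, gives the same result.

```python
import numpy as np


def softmax_with_temperature(logits, T=1.0):
    """Row-wise softmax of logits / T, computed in a numerically stable way."""
    z = np.asarray(logits, dtype=np.float64) / T
    # Subtracting the row maximum avoids overflow and leaves the result unchanged.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


# Hypothetical pre-softmax scores for two images and three classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 3.0, -1.0]])
T = 4.0  # assumed temperature, chosen only for illustration

# Option 3: softmax with temperature applied outside the network.
soft_targets = softmax_with_temperature(logits, T)

# Mirror of the Scale + Softmax pair: scale by the constant 1/T,
# then apply a plain (T = 1) softmax.
scaled = logits * (1.0 / T)
via_scale = softmax_with_temperature(scaled, 1.0)

print("constant filler value 1/T =", 1.0 / T)
print(soft_targets)
```

A higher T flattens the distribution, so the largest probability in each row is smaller than it would be with T = 1. The number printed as 1/T is what would replace the `1/T` placeholder in the Scale layer's filler.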