How to change the temperature of a softmax output in Keras

Question

16

I am currently trying to reproduce the results of the following article:

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

I am using Keras with the Theano backend. In the article, he talks about controlling the temperature of the final softmax layer to give different outputs.

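Temperature. We can also play with the temperature of the Softmax during sampling. Decreasing the temperature from 1 to some lower number (e.g. 0.5) makes the RNN more confident, but also more conservative in its samples. Conversely, higher temperatures will give more diversity but at the cost of more mistakes (e.g. spelling mistakes, etc.). In particular, setting the temperature very near zero will give the most likely thing that Paul Graham might say: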
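The effect described in the quote can be illustrated with a minimal NumPy sketch. The function name `softmax_with_temperature` and the example logits are illustrative assumptions; they come neither from the article nor from Keras.

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax(logits / temperature) as a float64 array."""
    z = np.asarray(logits, dtype=np.float64) / temperature
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = [2.0, 1.0, 0.1]  # made-up scores for three classes
cold = softmax_with_temperature(logits, temperature=0.5)  # sharper
base = softmax_with_temperature(logits, temperature=1.0)  # plain softmax
hot = softmax_with_temperature(logits, temperature=2.0)   # flatter
```

With these inputs the most likely class keeps its rank at every temperature; only how much probability mass it holds changes.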

My model is as follows.

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense
from keras.optimizers import Adam

# batch_size is assumed to be defined earlier
model = Sequential()
model.add(LSTM(128, batch_input_shape = (batch_size, 1, 256), stateful = True, return_sequences = True))
model.add(LSTM(128, stateful = True))
model.add(Dropout(0.1))
model.add(Dense(256, activation = 'softmax'))

model.compile(optimizer = Adam(),
              loss = 'categorical_crossentropy',
              metrics = ['accuracy'])

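The only way I can think of to adjust the temperature of the final Dense layer is to take the weight matrix and multiply it by the temperature. Does anyone know a better way? Also, if anyone sees anything wrong with how I set up the model, please let me know, since I am new to RNNs.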

- chasep255

3 Answers

7

@chasep255's answer works, but it produces warnings because of log(0). You can simplify the operation e^log(a)/T = a^(1/T) and get rid of the log:

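import numpy as np

def sample(a, temperature=1.0):
  # reweight the probabilities as a^(1/T); float64 keeps multinomial's sum check happy
  a = np.asarray(a, dtype=np.float64)**(1/temperature)
  p_sum = a.sum()
  sample_temp = a/p_sum
  # draw one index from the renormalized distribution
  return np.argmax(np.random.multinomial(1, sample_temp, 1))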

Hope this helps!
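As a rough check of the identity used in this answer, the sketch below compares the log/exp form with the power form on a made-up distribution that contains a zero. The helper names `reweight_log` and `reweight_pow` are hypothetical and exist only for this comparison.

```python
import numpy as np

def reweight_log(p, temperature):
    # log/exp form: np.log(0) gives -inf (normally with a warning,
    # silenced here), and np.exp maps -inf back to 0
    with np.errstate(divide="ignore"):
        a = np.exp(np.log(p) / temperature)
    return a / a.sum()

def reweight_pow(p, temperature):
    # power form: no logarithm is taken, so zeros cause no warning
    a = np.asarray(p, dtype=np.float64) ** (1.0 / temperature)
    return a / a.sum()

p = np.array([0.7, 0.2, 0.1, 0.0])  # made-up distribution with a zero entry
same = np.allclose(reweight_log(p, 0.5), reweight_pow(p, 0.5))
```

Both forms should agree up to floating-point error; the power form simply avoids the intermediate -inf.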

- Julian

1

I think you mean e^(log(a)/T) = a^(1/T). - Visionscaper

2

You can build a custom layer in Keras to apply the temperature.

The Keras code is below; use this layer like any other Keras layer, for example (Dense).

class Temperature(keras.layers.Layer):
  def __init__(self):
    super(Temperature, self).__init__()
    self.temperature = torch.nn.Parameter(torch.ones(1))
    
  def call(self, final_output):
    return final_output/ self.temperature

- Rial ALi

2

You should use self.add_weight or tf.Variable. In this example you are mixing a Keras layer with a torch parameter. - Damian Grzanka

Yes, you are right. - Rial ALi

It should be:

class Temperature(keras.layers.Layer):
  def __init__(self):
    super(Temperature, self).__init__()
    self.temperature = tf.Variable(initial_value=[1.], trainable=True)
    # self.temperature = tf.ones(1)

  def call(self, final_output):
    return final_output / self.temperature

- Rial ALi

- chasep255 · Accepted Answer

It looks like the temperature is something applied to the output of the softmax layer. I found this example.

https://github.com/fchollet/keras/blob/master/examples/lstm_text_generation.py

He uses the following function to sample from the softmax output.

def sample(a, temperature=1.0):
    # helper function to sample an index from a probability array
    a = np.log(a) / temperature
    a = np.exp(a) / np.sum(np.exp(a))
    return np.argmax(np.random.multinomial(1, a, 1))
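One way to see why reweighting the softmax output is enough: if p = softmax(z), then log(p) equals z minus a constant, and that constant cancels when the result is renormalized. Reweighting the probabilities should therefore give the same distribution as softmax(z / T) computed on the logits. The sketch below checks this numerically; the logits, the value of `T`, and the helper names `softmax` and `reweight` are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over a 1-D array of logits
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(z - z.max())
    return e / e.sum()

def reweight(p, temperature):
    # same arithmetic as the sample() helper above, minus the sampling step
    a = np.exp(np.log(p) / temperature)
    return a / a.sum()

logits = np.array([3.0, 1.5, -0.5, 0.2])  # made-up pre-softmax scores
T = 0.7
from_probs = reweight(softmax(logits), T)   # temperature applied after softmax
from_logits = softmax(logits / T)           # temperature applied before softmax
```

If the two arrays match, the trained model can stay unchanged and the temperature only needs to be applied at sampling time.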