Average weights in Keras models

Question

tensorflow neural-network keras deep-learning keras-layer

14

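How can I average the weights of Keras models when I train several models with the same structure but different initializations?

Right now my code looks roughly like this: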

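import keras
from keras.callbacks import ModelCheckpoint, TensorBoard
from keras.layers import (Activation, BatchNormalization, Conv2D, Dense,
                          Dropout, Flatten, Input, MaxPool2D)
from keras.models import Model
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import plot_model

datagen = ImageDataGenerator(rotation_range=15,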
                             width_shift_range=2.0/28,
                             height_shift_range=2.0/28
                            )

epochs = 40 
lr = (1.234e-3)
optimizer = Adam(lr=lr)

main_input = Input(shape= (28,28,1), name='main_input')

sub_models = []

for i in range(5):

    x = Conv2D(32, kernel_size=(3,3), strides=1)(main_input)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPool2D(pool_size=2)(x)

    x = Conv2D(64, kernel_size=(3,3), strides=1)(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPool2D(pool_size=2)(x)

    x = Conv2D(64, kernel_size=(3,3), strides=1)(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)

    x = Flatten()(x)

    x = Dense(1024)(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Dropout(0.1)(x)

    x = Dense(256)(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Dropout(0.4)(x)

    x = Dense(10, activation='softmax')(x)

    sub_models.append(x)

# Average only the softmax outputs of the five sub-networks
main_output = keras.layers.average(sub_models)

model = Model(inputs=[main_input], outputs=[main_output])

model.compile(loss='categorical_crossentropy', metrics=['accuracy'],
              optimizer=optimizer)

print(model.summary())

plot_model(model, to_file='model.png')

filepath="weights.best.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
tensorboard = TensorBoard(log_dir='./Graph', histogram_freq=0, write_graph=True, write_images=True)
callbacks = [checkpoint, tensorboard]

model.fit_generator(datagen.flow(X_train, y_train, batch_size=128),
                    steps_per_epoch=len(X_train) // 128,
                    epochs=epochs,
                    callbacks=callbacks,
                    verbose=1,
                    validation_data=(X_test, y_test))

So right now I only average the last layer, but I want to average the weights of all layers after training each model separately.

Thanks!

- Miłosz Bednarzak

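You can't simply average the weights of neural networks. - Dr. Snoopy

What have you tried so far? What happens if you call keras.layers.average() between each layer? - DarkCygnus

I don't want to average between each layer, because I want to train each model separately. Averaging after every layer gives a different result, and so does averaging the models at the last layer before training. - Miłosz Bednarzak

@MatiasValdenegro Yes, you can: https://arxiv.org/abs/1803.05407 - Scratch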

1

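@Scratch That paper does not support the idea asked about in this question; it is about averaging along the SGD trajectory, and it appeared after this question was asked. - Dr. Snoopy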

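True. Averaging the weights of models trained from different initializations makes almost no sense; I just wanted to point out that averaging weights can be useful in some specific cases. - Scratch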
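One well-known reason behind the comments above is that hidden units can be reordered without changing what a network computes, so two independently trained models have no reason to place the same feature at the same index. The sketch below is an illustration under assumptions, not code from the thread: the tiny NumPy network `mlp`, its sizes, and the random weights are all made up. It builds two networks that compute the same function and shows that their element-wise weight average computes something else.

```python
import numpy as np

rng = np.random.default_rng(0)


def mlp(x, w1, b1, w2, b2):
    """One-hidden-layer network with a ReLU activation (illustrative only)."""
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return hidden @ w2 + b2


# Model A: random weights for a 4 -> 8 -> 3 network.
w1, b1 = rng.normal(size=(4, 8)), rng.normal(size=8)
w2, b2 = rng.normal(size=(8, 3)), rng.normal(size=3)

# Model B: the same network with its hidden units reordered.
perm = np.roll(np.arange(8), 1)
w1_p, b1_p, w2_p = w1[:, perm], b1[perm], w2[perm, :]

x = rng.normal(size=(5, 4))
out_a = mlp(x, w1, b1, w2, b2)
out_b = mlp(x, w1_p, b1_p, w2_p, b2)

# Element-wise average of the two weight sets.
out_avg = mlp(x, (w1 + w1_p) / 2, (b1 + b1_p) / 2, (w2 + w2_p) / 2, b2)

a_equals_b = np.allclose(out_a, out_b)      # A and B compute the same function
a_equals_avg = np.allclose(out_a, out_avg)  # the averaged network generally does not
print(a_equals_b, a_equals_avg)
```

Averaging along a single training trajectory, as in the linked paper, avoids this mismatch because all snapshots share the same unit ordering.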

3 Answers

10

I can't comment on the accepted answer, but to make it work with tensorflow 2.0 and tf.keras I had to convert the list inside the loop to a numpy array:

new_weights = list()
for weights_list_tuple in zip(*weights): 
    new_weights.append(
        np.array([np.array(w).mean(axis=0) for w in zip(*weights_list_tuple)])
    )

If the input models need different weighting, replace np.array(w).mean(axis=0) with np.average(np.array(w), axis=0, weights=relative_weights), where relative_weights is an array holding one weight factor per model.
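A possibly simpler variant, as a sketch: since every model has the same architecture, the arrays at the same position of each `get_weights()` list have identical shapes, so they can be stacked and averaged in one call. The helper name `average_weights` and the toy arrays standing in for `model.get_weights()` are assumptions for illustration, not part of the answer.

```python
import numpy as np


def average_weights(weights, relative_weights=None):
    """Average per-model weight lists position by position.

    weights: one entry per model, each a list of numpy arrays
             (the format returned by model.get_weights()).
    relative_weights: optional sequence with one factor per model;
                      None gives a plain mean.
    """
    averaged = []
    for arrays in zip(*weights):
        # Stack the same weight array from every model along a new axis 0.
        stacked = np.stack(arrays, axis=0)
        averaged.append(np.average(stacked, axis=0, weights=relative_weights))
    return averaged


# Toy stand-ins for model.get_weights() of two models with identical structure.
weights = [
    [np.ones((2, 3)), np.zeros(3)],
    [3 * np.ones((2, 3)), 2 * np.ones(3)],
]

plain = average_weights(weights)
weighted = average_weights(weights, relative_weights=[3, 1])
```

With a real model the result would then be passed to `set_weights`, as in the accepted answer.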

- ursusminimus

I get the error "TypeError: zip argument #5 must support iteration". Why does this happen? - Koti

0

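I have a function in TensorFlow/Keras that computes the average of the trainable parameters of multiple client models. The average is computed layer by layer. Here is the function I am using: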

import tensorflow as tf


def average_client_weights(client_models):
    """
    Compute the average of the trainable parameters across multiple client models.

    This function takes a list of client models and calculates the average of their 
    trainable parameters. The averaging is done layer-wise, meaning that the average 
    for each layer is computed separately and then returned as a list of average weights 
    for each layer.

    Args:
    - client_models (list of objects): A list of objects representing the client models. 
      Each object is expected to have an attribute `trainable_variables` that returns a 
      list of `tf.Variable` objects representing the trainable parameters of the model.

    Returns:
    - avg_weights (list of tf.Tensor): A list of tensors representing the average weights 
      of the trainable parameters of the client models. Each tensor in the list corresponds 
      to the average weight for a specific layer.

    Example:
    If client_models[0].trainable_variables = [W1, b1, W2, b2], where W1, b1, W2, b2 are 
    tensors, then avg_weights = [avg_W1, avg_b1, avg_W2, avg_b2], where avg_W1, avg_b1, 
    avg_W2, avg_b2 are the average weights for each corresponding layer.
    """
    # Retrieve the trainable variables from each client model
    client_weights = [model.trainable_variables for model in client_models]

    # Compute the average weights for each layer
    avg_weights = [
        tf.reduce_mean(layer_weight_tensors, axis=0)
        for layer_weight_tensors in zip(*client_weights)
    ]

    return avg_weights

- Mike th

- Marcin Możejko · Accepted Answer

Suppose models is a collection containing your models. First, collect all the weights:

weights = [model.get_weights() for model in models]

Now, create the new averaged weights:

new_weights = list()

for weights_list_tuple in zip(*weights):
    new_weights.append(
        [numpy.array(weights_).mean(axis=0)\
            for weights_ in zip(*weights_list_tuple)])

All that remains is to set these weights in a new model:

new_model.set_weights(new_weights)

Of course, averaging weights might be a bad idea, but if you want to try it, this is the approach to follow.
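To make the nested zip easier to follow, here is a small check on toy data. The random arrays stand in for `model.get_weights()` and are an assumption, not real Keras output. The outer zip groups the same weight array from every model, and the inner zip walks the first axis of those arrays in lockstep. For arrays with at least one dimension, the result should match a direct element-wise mean.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for model.get_weights() from three models: a kernel and a bias.
weights = [[rng.normal(size=(4, 2)), rng.normal(size=2)] for _ in range(3)]

new_weights = []
for weights_list_tuple in zip(*weights):
    # weights_list_tuple holds the same weight array from every model;
    # each weights_ is the tuple of corresponding rows (or scalars for a bias).
    rows = [np.array(weights_).mean(axis=0) for weights_ in zip(*weights_list_tuple)]
    new_weights.append(np.array(rows))

# Direct element-wise mean for comparison.
direct = [np.mean(np.stack(arrays), axis=0) for arrays in zip(*weights)]

same = all(np.allclose(a, b) for a, b in zip(new_weights, direct))
print(same)
```

The row-by-row version relies on iterating each array, so it would not work for a 0-dimensional weight, whereas the stacked mean does not have that restriction.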