Modifying a TensorFlow graph and resuming training

Question

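I want to load the pretrained weights of the MCnet model and resume training. The pretrained model provided here was trained with parameters K=4, T=7, but I need a model with K=4, T=1. I would like to load the weights from this pretrained model rather than train from scratch. However, because the graph has changed, I cannot load the pretrained model.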

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [5,5,15,64] rhs shape= [5,5,33,64]
     [[node save/Assign_13 (defined at /media/nagabhushan/Data02/SNB/IISc/Research/04_Gaming_Video_Prediction/Workspace/VideoPrediction/Literature/01_MCnet/src/snb/mcnet.py:108) ]]

Is it possible to load the pretrained model using the new graph?

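What I have tried:
Previously, I wanted to port the pretrained model from an older version of TensorFlow to a newer one. I got this answer on StackOverflow, which helped me port the model. The idea is to create the new graph and load from the saved checkpoint only those variables that also exist in the new graph.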

with tf.Session() as sess:
    _ = MCNET(image_size=[240, 320], batch_size=8, K=4, T=1, c_dim=3, checkpoint_dir=None, is_train=True)
    tf.global_variables_initializer().run(session=sess)

    ckpt_vars = tf.train.list_variables(model_path.as_posix())
    ass_ops = []
    for dst_var in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES):
        for (ckpt_var, ckpt_shape) in ckpt_vars:
            if dst_var.name.split(":")[0] == ckpt_var and dst_var.shape == ckpt_shape:
                value = tf.train.load_variable(model_path.as_posix(), ckpt_var)
                ass_ops.append(tf.assign(dst_var, value))

    # Assign the variables
    sess.run(ass_ops)
    saver = tf.train.Saver()
    saver.save(sess, save_path.as_posix())

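I tried the same approach here and it worked, i.e. I got a new trained model for K=4, T=1. But I am not sure whether it is valid! I mean, will the weights make sense? Is this the right way to do it?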

Model info:
MCnet is a model for video prediction, i.e. given K past frames, it predicts the next T frames.

Any help is much appreciated.

- Nagabhushan S N

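You can load the weights in the shape they have in the original model, and maybe add zeros or ones (or some weight initializer) to fill the rest. - learner

Here it is the other way around. My new model has fewer parameters (I guess). Since the model uses an LSTM, I am not sure whether dropping some weights is feasible or would affect me negatively. - Nagabhushan S N

Oh, I see. Can you list the weight matrices? That way (if possible) one could work out in a meaningful way which weights to drop. - learner

Do you mean listing the parameters here? I guess there would be too many. - Nagabhushan S N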

1 Answer

- learner · Accepted Answer

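The MCnet model has a generator and a discriminator. The generator is LSTM-based, so its weights can be loaded without any problem when the number of time steps T changes. The discriminator, however, is written as a convolutional network. To apply convolutional layers to a video, the authors concatenate the frames along the channel dimension. With K=4, T=7 you get a video of length 11 with 3 channels per frame; concatenating the frames along the channel dimension gives an image with 33 channels. The first layer of the discriminator is therefore defined with 33 input channels, and its weights have the matching shape ([5,5,33,64] in your error message). With K=4, T=1, the video has length 5, the concatenated image has 15 channels, and the new graph expects weights with 15 input channels ([5,5,15,64]). That is the mismatch you are seeing. To work around it, you can simply take the weights of the first 15 channels (for lack of a better idea). Here is the code: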

import numpy
import tensorflow as tf

# MCNET, model_path and save_path are the same objects as in the question.
with tf.Session() as sess:
    # Build the new graph (K=4, T=1) and initialise all of its variables.
    _ = MCNET(image_size=[240, 320], batch_size=8, K=4, T=1, c_dim=3, checkpoint_dir=None, is_train=True)
    tf.global_variables_initializer().run(session=sess)

    ckpt_vars = tf.train.list_variables(model_path.as_posix())
    ass_ops = []
    for dst_var in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES):
        for (ckpt_var, ckpt_shape) in ckpt_vars:
            if dst_var.name.split(":")[0] != ckpt_var:
                continue
            value = tf.train.load_variable(model_path.as_posix(), ckpt_var)
            dst_shape = dst_var.shape.as_list()
            if dst_shape == list(ckpt_shape):
                # Same shape: copy the checkpoint value unchanged.
                ass_ops.append(tf.assign(dst_var, value))
            else:
                # Shapes differ. This assumes only axis 2 (input channels) differs.
                if dst_shape[2] <= value.shape[2]:
                    # Fewer input channels: keep the first dst_shape[2] of them.
                    adjusted_value = value[:, :, :dst_shape[2], ...]
                else:
                    # More input channels: copy the old ones, fill the rest randomly.
                    adjusted_value = numpy.random.random(dst_shape).astype(value.dtype)
                    adjusted_value[:, :, :value.shape[2], ...] = value
                ass_ops.append(tf.assign(dst_var, adjusted_value))

    # Assign the variables
    sess.run(ass_ops)
    saver = tf.train.Saver()
    saver.save(sess, save_path.as_posix())
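
To make the channel arithmetic and the slicing step concrete, here is a minimal NumPy-only sketch that can be run without TensorFlow or the checkpoint. The helper names `discriminator_in_channels` and `adapt_in_channels` are hypothetical (they are not part of MCnet), the kernel is dummy data rather than real weights, and the small-normal fill used when growing a kernel is an assumption, not something the MCnet code prescribes.

```python
import numpy as np


def discriminator_in_channels(K, T, c_dim):
    """Input channels of the first discriminator layer when K + T frames
    with c_dim channels each are concatenated along the channel axis."""
    return (K + T) * c_dim


def adapt_in_channels(kernel, new_in, rng=None, stddev=0.02):
    """Return a copy of a [height, width, in, out] conv kernel whose
    input-channel axis (axis 2) has been resized to new_in."""
    old_in = kernel.shape[2]
    if new_in <= old_in:
        # Shrinking: keep only the first new_in input channels.
        return kernel[:, :, :new_in, :].copy()
    # Growing: keep the old channels and fill the new ones with small noise
    # (an assumed initialiser, chosen only for illustration).
    rng = np.random.default_rng(0) if rng is None else rng
    shape = kernel.shape[:2] + (new_in,) + kernel.shape[3:]
    out = rng.normal(0.0, stddev, size=shape).astype(kernel.dtype)
    out[:, :, :old_in, :] = kernel
    return out


# Dummy kernel with the checkpoint shape from the error message.
old = np.arange(5 * 5 * 33 * 64, dtype=np.float32).reshape(5, 5, 33, 64)
new = adapt_in_channels(old, discriminator_in_channels(K=4, T=1, c_dim=3))
print(new.shape)  # (5, 5, 15, 64)
```

Whether the first 15 channels are the most meaningful ones to keep depends on the order in which the frames are concatenated in the discriminator input, which is worth checking in mcnet.py. The kept slices may also have been tuned for a longer clip, so some fine-tuning of the discriminator is probably still needed.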