model.summary() can't print output shapes when using a subclassed model.

29

These are two ways of creating a Keras model, but the output shapes in the summary results of the two methods differ. Clearly, the former prints more information and makes it easier to check the correctness of the network.

import tensorflow as tf
from tensorflow.keras import Input, layers, Model

class subclass(Model):
    def __init__(self):
        super(subclass, self).__init__()
        self.conv = layers.Conv2D(28, 3, strides=1)

    def call(self, x):
        return self.conv(x)


def func_api():
    x = Input(shape=(24, 24, 3))
    y = layers.Conv2D(28, 3, strides=1)(x)
    return Model(inputs=[x], outputs=[y])

if __name__ == '__main__':
    func = func_api()
    func.summary()

    sub = subclass()
    sub.build(input_shape=(None, 24, 24, 3))
    sub.summary()
Output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 24, 24, 3)         0         
_________________________________________________________________
conv2d (Conv2D)              (None, 22, 22, 28)        784       
=================================================================
Total params: 784
Trainable params: 784
Non-trainable params: 0
_________________________________________________________________
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            multiple                  784       
=================================================================
Total params: 784
Trainable params: 784
Non-trainable params: 0
_________________________________________________________________

So, how can I get the output shapes in summary() when using the subclassing method?

8 Answers

32

I solved the problem with the method below; I don't know if there is an easier way.

class subclass(Model):
    def __init__(self):
        ...
    def call(self, x):
        ...

    def model(self):
        x = Input(shape=(24, 24, 3))
        return Model(inputs=[x], outputs=self.call(x))



if __name__ == '__main__':
    sub = subclass()
    sub.model().summary()

2
Could you explain why this code works? Especially the outputs=self.call(x) part. - Gilfoyle
4
By evaluating outputs=self.call(x), the method subclass.call(self, x) is invoked. This triggers the shape computation in the enclosing instance. In addition, the instance returned by Model computes its own shape, which is reported in .summary(). The main problem with this approach is that the input shape is the constant shape=(24, 24, 3); if you need a dynamic solution, this approach cannot be used. - Rob Hall
Could you explain what goes into the ...? Is this a generic solution, or does it need model-specific things in those calls? - GuySoft
1
@GuySoft The ... in __init__ instantiates your layers, while the ... in call connects the layers to build the network. This applies to all subclassed Keras models. - DeWil
Wouldn't it then be better/easier to just override the summary function? I.e. def summary(self): x = ... dummy_model = Model(... dummy_model.summary. You could even take the shape as an input variable of the new summary function. - Lu Kas
The last line of code in my comment should read dummy_model.summary() - Lu Kas
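Combining Lu Kas's and Rob Hall's points above, the override can take the shape as a parameter so it is no longer tied to a constant input size. A minimal sketch of that idea (my own, not from the thread; the model body is the OP's single Conv2D):

```python
import tensorflow as tf
from tensorflow.keras import Input, layers, Model

class subclass(Model):
    def __init__(self):
        super(subclass, self).__init__()
        self.conv = layers.Conv2D(28, 3, strides=1)

    def call(self, x):
        return self.conv(x)

    def summary(self, input_shape=(24, 24, 3)):
        # Build a throwaway functional model around call() just to
        # print per-layer output shapes for the given input_shape.
        x = Input(shape=input_shape)
        dummy_model = Model(inputs=[x], outputs=self.call(x))
        return dummy_model.summary()

if __name__ == '__main__':
    sub = subclass()
    sub.summary(input_shape=(32, 32, 3))  # any shape, chosen at call time
```

Note that this changes the signature of keras.Model.summary(), which normally accepts arguments such as line_length; that trade-off matters if other code calls summary() with those arguments.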

9
I solved the problem in a way very similar to what Elazar mentioned: override the summary() function in the subclass class; then you can call summary() directly while using model subclassing.
class subclass(Model):
    def __init__(self):
        ...
    def call(self, x):
        ...

    def summary(self):
        x = Input(shape=(24, 24, 3))
        model = Model(inputs=[x], outputs=self.call(x))
        return model.summary()

if __name__ == '__main__':
    sub = subclass()
    sub.summary()

Is there any advantage over Elazar's solution? I like your approach since it is more concise. - MOON
1
@MOON I believe they are just the same, except that this one is neater. It isn't any TF trick, just some basic object-oriented programming in Python. - LIU Qingyuan

7
I think the key point is the method _init_graph_network in the class Network, the parent class of Model. _init_graph_network is called if you specify the inputs and outputs arguments when calling __init__.
So there are two possible approaches:
  1. Call the _init_graph_network method manually to build the graph of the model.
  2. Reinitialize with the input layer and output.
Both approaches need the input layer and the output (obtained from self.call). Calling summary will now give the exact output shapes. However, it will show the Input layer, which is not part of a subclassed Model.
from tensorflow import keras
from tensorflow.keras import layers as klayers

class MLP(keras.Model):
    def __init__(self, input_shape=(32,), **kwargs):
        super(MLP, self).__init__(**kwargs)
        # Add input layer
        self.input_layer = klayers.Input(input_shape)

        self.dense_1 = klayers.Dense(64, activation='relu')
        self.dense_2 = klayers.Dense(10)

        # Get output layer with `call` method
        self.out = self.call(self.input_layer)

        # Reinitial
        super(MLP, self).__init__(
            inputs=self.input_layer,
            outputs=self.out,
            **kwargs)

    def build(self):
        # Initialize the graph
        self._is_graph_network = True
        self._init_graph_network(
            inputs=self.input_layer,
            outputs=self.out
        )

    def call(self, inputs):
        x = self.dense_1(inputs)
        return self.dense_2(x)

if __name__ == '__main__':
    mlp = MLP((16,))
    mlp.summary()

The output will be:
Model: "mlp_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 16)]              0         
_________________________________________________________________
dense (Dense)                (None, 64)                1088      
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650       
=================================================================
Total params: 1,738
Trainable params: 1,738
Non-trainable params: 0
_________________________________________________________________

Update: to achieve the above, you don't need to redefine build(). - Ryukendo Dey
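Per Ryukendo Dey's update, the build() override can be dropped: re-running super().__init__ with inputs/outputs is enough on the TF 2.x versions this answer targets. A trimmed sketch (my condensation of the answer's code, not separately verified on every TF release):

```python
from tensorflow import keras
from tensorflow.keras import layers as klayers

class MLP(keras.Model):
    def __init__(self, input_shape=(32,), **kwargs):
        super(MLP, self).__init__(**kwargs)
        self.input_layer = klayers.Input(input_shape)
        self.dense_1 = klayers.Dense(64, activation='relu')
        self.dense_2 = klayers.Dense(10)
        # Trace call() once to obtain the symbolic output...
        self.out = self.call(self.input_layer)
        # ...then re-initialize as a graph network; no build() override needed.
        super(MLP, self).__init__(
            inputs=self.input_layer, outputs=self.out, **kwargs)

    def call(self, inputs):
        x = self.dense_1(inputs)
        return self.dense_2(x)

if __name__ == '__main__':
    MLP((16,)).summary()
```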

5
I analyzed Adi Shumely's answer:
  • Adding an input_shape should not be needed, since you already set it as an argument to build()
  • Adding an Input layer does nothing for the model; it is only passed as an argument to the call() method
  • Adding the so-called output is not the way to go, in my opinion. The only and most important thing it does is call the call() method.
So I came up with this solution, which does not require modifying the model: it only improves the model by adding a call to the call() method with an Input tensor before the summary() method is invoked. I tried it on my own model and on the three models presented in this thread, and so far it works.
The first post from this thread:
import tensorflow as tf
from tensorflow.keras import Input, layers, Model

class subclass(Model):
    def __init__(self):
        super(subclass, self).__init__()
        self.conv = layers.Conv2D(28, 3, strides=1)

    def call(self, x):
        return self.conv(x)

if __name__ == '__main__':
    sub = subclass()
    sub.build(input_shape=(None, 24, 24, 3))

    # Adding this call to the call() method solves it all
    sub.call(Input(shape=(24, 24, 3)))

    # And the summary() outputs all the information
    sub.summary()

The second post from this thread, made dynamic:

from tensorflow import keras
from tensorflow.keras import layers as klayers

class MLP(keras.Model):
    def __init__(self, **kwargs):
        super(MLP, self).__init__(**kwargs)
        self.dense_1 = klayers.Dense(64, activation='relu')
        self.dense_2 = klayers.Dense(10)

    def call(self, inputs):
        x = self.dense_1(inputs)
        return self.dense_2(x)

if __name__ == '__main__':
    mlp = MLP()
    mlp.build(input_shape=(None, 16))
    mlp.call(klayers.Input(shape=(16,)))
    mlp.summary()

The last post from this thread, made dynamic:

import tensorflow as tf
class MyModel(tf.keras.Model):
    def __init__(self, **kwargs):
        super(MyModel, self).__init__(**kwargs) 
        self.dense10 = tf.keras.layers.Dense(10, activation=tf.keras.activations.softmax)    
        self.dense20 = tf.keras.layers.Dense(20, activation=tf.keras.activations.softmax)
    
    def call(self, inputs):
        x =  self.dense10(inputs)
        y_pred =  self.dense20(x)
        return y_pred

model = MyModel()
model.build(input_shape = (None, 32, 32, 1))
model.call(tf.keras.layers.Input(shape = (32, 32, 1)))
model.summary()
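A small aside on the snippets above (my note, not the answerer's): invoking the model through __call__ on a symbolic Input, i.e. model(...) rather than model.call(...), triggers the same shape tracing while also running Keras's standard build logic, so it may be the safer spelling:

```python
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self, **kwargs):
        super(MyModel, self).__init__(**kwargs)
        self.dense10 = tf.keras.layers.Dense(10, activation='softmax')
        self.dense20 = tf.keras.layers.Dense(20, activation='softmax')

    def call(self, inputs):
        x = self.dense10(inputs)
        return self.dense20(x)

model = MyModel()
# __call__ on a symbolic Input builds every layer and records its shape
model(tf.keras.Input(shape=(32, 32, 1)))
model.summary()
```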

2

I ran into the same problem - and solved it through the following 3 steps:

  1. Add an input_shape in __init__
  2. Add an input_layer
  3. Add an output layer
class MyModel(tf.keras.Model):
    
    def __init__(self,input_shape=(32,32,1), **kwargs):
        super(MyModel, self).__init__(**kwargs) 
        self.input_layer = tf.keras.layers.Input(input_shape)
        self.dense10 = tf.keras.layers.Dense(10, activation=tf.keras.activations.softmax)    
        self.dense20 = tf.keras.layers.Dense(20, activation=tf.keras.activations.softmax)
        self.out = self.call(self.input_layer)    
    
    def call(self, inputs):
        x =  self.dense10(inputs)
        y_pred =  self.dense20(x)
     
        return y_pred

model = MyModel()
model(x_test[:99])  # x_test is assumed here, e.g. images of shape (N, 32, 32, 1)
print('x_test[:99].shape:', x_test[:99].shape)
model.summary()

Output:

x_test[:99].shape: (99, 32, 32, 1)
Model: "my_model_32"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_79 (Dense)             (None, 32, 32, 10)        20        
_________________________________________________________________
dense_80 (Dense)             (None, 32, 32, 20)        220       
=================================================================
Total params: 240
Trainable params: 240
Non-trainable params: 0


1
I used this approach to solve the problem, tested on tensorflow 2.1 and tensorflow 2.4.1. Declare the InputLayer with model.inputs_layer.
import tensorflow as tf
from tensorflow.keras import layers

class Logistic(tf.keras.models.Model):
    def __init__(self, hidden_size = 5, output_size=1, dynamic=False, **kwargs):
        '''
        name: String name of the model.
        dynamic: (Subclassed models only) Set this to `True` if your model should
            only be run eagerly, and should not be used to generate a static
            computation graph. This attribute is automatically set for Functional API
            models.
        trainable: Boolean, whether the model's variables should be trainable.
        dtype: (Subclassed models only) Default dtype of the model's weights (
            default of `None` means use the type of the first input). This attribute
            has no effect on Functional API models, which do not have weights of their
            own.
        '''
        super().__init__(dynamic=dynamic, **kwargs)
        self.inputs_ = tf.keras.Input(shape=(2,), name="hello")
        self._set_input_layer(self.inputs_)
        self.hidden_size = hidden_size
        self.dense = layers.Dense(hidden_size, name = "linear")
        self.outlayer = layers.Dense(output_size, 
                        activation = 'sigmoid', name = "out_layer")
        self.build()
        

    def _set_input_layer(self, inputs):
        """add inputLayer to model and display InputLayers in model.summary()

        Args:
            inputs ([dict]): the result from `tf.keras.Input`
        """
        if isinstance(inputs, dict):
            self.inputs_layer = {n: tf.keras.layers.InputLayer(input_tensor=i, name=n) 
                                    for n, i in inputs.items()}
        elif isinstance(inputs, (list, tuple)):
            self.inputs_layer = [tf.keras.layers.InputLayer(input_tensor=i, name=i.name) 
                                    for i in inputs]
        elif tf.is_tensor(inputs):
            self.inputs_layer = tf.keras.layers.InputLayer(input_tensor=inputs, name=inputs.name)
    
    def build(self):
        super(Logistic, self).build(self.inputs_.shape if tf.is_tensor(self.inputs_) else self.inputs_)
        _ = self.call(self.inputs_)
    

    def call(self, X):
        X = self.dense(X)
        Y = self.outlayer(X)
        return Y

model = Logistic()
model.summary()

Model: "logistic"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
hello:0 (InputLayer)         [(None, 2)]               0         
_________________________________________________________________
linear (Dense)               (None, 5)                 15        
_________________________________________________________________
out_layer (Dense)            (None, 1)                 6         
=================================================================
Total params: 21
Trainable params: 21
Non-trainable params: 0
_________________________________________________________________

1

I added only one line to your code (below).

self.call(Input(shape=(24, 24, 3)))

My code is:

import tensorflow as tf
from tensorflow.keras import Input, layers, Model

class subclass(Model):
    def __init__(self):
        super(subclass, self).__init__()
        self.conv = layers.Conv2D(28, 3, strides=1)
    
        # add this code
        self.call(Input(shape=(24, 24, 3)))

    def call(self, x):
        return self.conv(x)


def func_api():
    x = Input(shape=(24, 24, 3))
    y = layers.Conv2D(28, 3, strides=1)(x)
    return Model(inputs=[x], outputs=[y])

if __name__ == '__main__':
    func = func_api()
    func.summary()

    sub = subclass()
    sub.build(input_shape=(None, 24, 24, 3))
    sub.summary()

Result:

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 24, 24, 3)]       0
_________________________________________________________________
conv2d (Conv2D)              (None, 22, 22, 28)        784
=================================================================
Total params: 784
Trainable params: 784
Non-trainable params: 0
_________________________________________________________________
Model: "subclass"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 22, 22, 28)        784
=================================================================
Total params: 784
Trainable params: 784
Non-trainable params: 0                
_______________________________________________________________

0

Gary's answer works. However, for even more convenience, I wanted to access the summary method of keras.Model transparently from my custom class objects.

This can easily be achieved by implementing the built-in __getattr__ method (more info in the official Python documentation) as follows:

from tensorflow.keras import Input, layers, Model

class MyModel():
    def __init__(self):
        self.model = self.get_model()

    def get_model(self):
        # here we use the usual Keras functional API
        x = Input(shape=(24, 24, 3))
        y = layers.Conv2D(28, 3, strides=1)(x)
        return Model(inputs=[x], outputs=[y])

    def __getattr__(self, name):
        """
        This method enables to access an attribute/method of self.model.
        Thus, any method of keras.Model() can be used transparently from a MyModel object
        """
        return getattr(self.model, name)


if __name__ == '__main__':
    mymodel = MyModel()
    mymodel.summary()  # underlyingly calls MyModel.model.summary()

Content provided by Stack Overflow.