I implemented a self-attention mechanism with TensorFlow Keras, with some modifications (e.g., a residual (additive) connection).

My input shape is as follows:

myinput:
KerasTensor(type_spec=TensorSpec(shape=(None, 8, 6, 64), dtype=tf.float32, name=None), name='multiply/mul:0', description="created by layer 'multiply'")
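For reproduction purposes, a tensor with the same spec can be created with a plain Keras Input placeholder (in my real model it comes out of a Multiply layer, but that part is not needed to reproduce the behaviour):

```python
import tensorflow as tf

# Stand-in for my_input with spec (None, 8, 6, 64), matching the tensor above.
my_input = tf.keras.Input(shape=(8, 6, 64), dtype=tf.float32)
print(my_input.shape)  # (None, 8, 6, 64)
```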
My goal is to pass each of the 8 timestamps in TensorSpec(shape=(None, 8, 6, 64)) through the self-attention mechanism (processing each 6 × 64 slice separately), obtain a self-attention feature map per timestamp, and then concatenate the results back into an output tensor of shape (None, 8, 6, 64). The implemented code is as follows:
    import tensorflow as tf
    from tensorflow.keras.layers import Permute

    def conv1d(channels, ks=1, strides=1, padding='same'):
        conv = tf.keras.layers.Conv1D(channels, ks, strides, padding, activation='relu', use_bias=False,
                                      kernel_initializer='HeNormal')
        return conv

    class my_self_attention(tf.keras.layers.Layer):
        def __init__(self, channels):
            super(my_self_attention, self).__init__()
            self.query = conv1d(channels)
            self.key = conv1d(channels)
            self.value = conv1d(channels)
            self.gamma = tf.compat.v1.get_variable("gamma", [1], initializer=tf.constant_initializer(0.0))

        def call(self, x):
            x = tf.reshape(x, shape=[-1, x.shape[2], x.shape[3]])
            f = self.query(x),
            g = self.key(x)
            h = self.value(x)
            attention_weights = tf.keras.activations.softmax(
                tf.matmul(g, Permute((2, 1))(f)))  # multiply query with key, then apply softmax
            sensor_att_fm = tf.matmul(attention_weights, h)
            o = self.gamma * sensor_att_fm + x
            return tf.reshape(o, shape=[-1, 1, x.shape[1], x.shape[2]])

    sa = my_self_attention(channels)
    refined_fm = tf.concat([sa(tf.expand_dims(my_input[:, t, :, :], 1)) for t in range(my_input.shape[1])], 1)
I get the following error:
ValueError: Dimension must be 4 but is 3 for '{{node my_self_attention/permute/transpose}} = Transpose[T=DT_FLOAT, Tperm=DT_INT32](my_self_attention/permute/transpose/a, my_self_attention/permute/transpose/perm)' with input shapes: [1,?,6,64], [3].
How can I fix this?
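One detail worth checking when reproducing this: in the line `f = self.query(x),` the trailing comma wraps the Conv1D output in a one-element tuple. When TensorFlow converts that tuple back into a tensor it stacks it, adding a leading axis, which matches the 4-D shape [1,?,6,64] in the error message, while Permute((2, 1)) transposes with a 3-element perm. A minimal sketch of the effect (the names here are illustrative stand-ins, not my actual layers):

```python
import tensorflow as tf

# Stand-in for the Conv1D output inside call(): shape (batch, 6, 64).
t = tf.zeros([2, 6, 64])

# Trailing comma as in `f = self.query(x),` -> a one-element tuple.
f = t,

# Converting the tuple to a tensor stacks it, adding a leading axis.
stacked = tf.convert_to_tensor(f)
print(stacked.shape)  # (1, 2, 6, 64)

# Permute((2, 1)) transposes with perm (0, 2, 1), which is only valid for
# 3-D input; the 4-D stacked tensor triggers the
# "Dimension must be 4 but is 3" error.
try:
    tf.transpose(stacked, perm=[0, 2, 1])
except Exception as e:
    print("transpose failed:", type(e).__name__)
```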