How does tf.space_to_depth() work in TensorFlow?

Question

7

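I am a PyTorch user. I have a model pretrained in TensorFlow that I want to convert to PyTorch. One part of the TensorFlow model definition uses the function tf.space_to_depth, which transforms an input of size (None, 38, 38, 64) into (None, 19, 19, 256). The documentation for this function is at https://www.tensorflow.org/api_docs/python/tf/space_to_depth, but I cannot work out what the function actually does. Could you provide some numpy code to illustrate it?

Really, I want to build an exactly equivalent layer in PyTorch.

Some TensorFlow code reveals another secret. Here is the code: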

import numpy as np
import tensorflow as tf

norm = tf.random_normal([1, 2, 2, 1], mean=0, stddev=1)
trans = tf.space_to_depth(norm,2)

with tf.Session() as s:
    norm = s.run(norm)
    trans = s.run(trans)



print("Norm")
print(norm.shape)
for index,value in np.ndenumerate(norm):
    print(value)

print("Trans")
print(trans.shape)
for index,value in np.ndenumerate(trans):
    print(value)

Here is the output:

Norm
(1, 2, 2, 1)
0.695261
0.455764
1.04699
-0.237587
Trans
(1, 1, 1, 4)
1.01139
0.898777
0.210135
2.36742

As you can see above, in addition to the data being reshaped, the tensor values have changed!

- Moahammad mehdi

2

Your values are probably changing because you call session.run separately for norm and trans, so different random values are generated on each call. - Allan Zelener

Try norm, trans = s.run([norm, trans]) instead. - Allan Zelener

5 Answers

4

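The tf.space_to_depth function splits your input into blocks and concatenates them.

In your example the input is 38x38x64 (and I guess block_size is 2). So the function splits your input into 4 parts (block_size x block_size) and concatenates them, which gives the 19x19x256 output.

You just need to split each channel (of the input) into block_size * block_size patches (each of size width/block_size x height/block_size) and then concatenate all of those patches. This should be quite simple with numpy.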
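A minimal NumPy sketch of this splitting, assuming NHWC layout and the channel ordering described in the TensorFlow documentation (the position inside each block forms the high-order part of the output channel index). The name `space_to_depth_nhwc` is illustrative, not a library function. Note that each of the block_size * block_size groups ends up being a strided sub-grid of the input rather than a contiguous patch.

```python
import numpy as np

def space_to_depth_nhwc(x, block_size):
    """Rearrange (B, H, W, C) into (B, H/bs, W/bs, C*bs*bs)."""
    b, h, w, c = x.shape
    if h % block_size or w % block_size:
        raise ValueError("height and width must be divisible by block_size")
    out_h, out_w = h // block_size, w // block_size
    # Split H into (out_h, bs) and W into (out_w, bs).
    x = x.reshape(b, out_h, block_size, out_w, block_size, c)
    # Move the two in-block offsets next to the channel axis.
    x = x.transpose(0, 1, 3, 2, 4, 5)
    # Merge (bs, bs, C) into a single depth axis.
    return x.reshape(b, out_h, out_w, block_size * block_size * c)

x = np.arange(1 * 38 * 38 * 64, dtype=np.float32).reshape(1, 38, 38, 64)
print(space_to_depth_nhwc(x, 2).shape)  # (1, 19, 19, 256)
```

No values are altered here; the reshape and transpose only move existing elements to new positions.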

Hope this helps.

- A. Piro

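Thanks for your answer. But I think some data rearrangement is going on! Did you see that in the documentation? I don't think it is just a split! - Moahammad mehdi

I don't see anything about shuffling in the TensorFlow documentation (https://www.tensorflow.org/api_docs/python/tf/space_to_depth).. but maybe there is :) - A. Piro

Hello again, and thanks for the reply. Could you provide some numpy code that performs this split? I am really new to both tensorflow and numpy. I would like to get exactly the same result as tf.space_to_depth. Please! - Moahammad mehdi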

2

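The split, stack, and permute functions in PyTorch can reproduce the result of TensorFlow's space_to_depth. Below is a PyTorch code example that assumes the input data is in BHWC format.

From the block size and the input shape we can compute the output shape. First, the input is split along the "width" dimension (dimension 2) in chunks of block size. The result is a sequence of length d_width, just as if you had cut a cake (by block size) into d_width pieces. Each piece is then reshaped so that it has the correct output height and output depth (channels). Finally, the pieces are stacked together and permuted.

Hope this helps.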

import torch

def space_to_depth(input, block_size):
    # input is expected in BHWC layout
    block_size_sq = block_size * block_size
    (batch_size, s_height, s_width, s_depth) = input.size()
    d_depth = s_depth * block_size_sq
    d_width = s_width // block_size
    d_height = s_height // block_size
    # Split along the width axis into d_width strips of width block_size
    t_1 = input.split(block_size, 2)
    stack = [t_t.contiguous().view(batch_size, d_height, d_depth) for t_t in t_1]
    output = torch.stack(stack, 1)
    # (B, d_width, d_height, d_depth) -> (B, d_height, d_width, d_depth)
    output = output.permute(0, 2, 1, 3)
    return output
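For readers who prefer NumPy, the same split, reshape, stack, and permute steps can be sketched as follows. This is only an illustration under the same BHWC assumption; `space_to_depth_split` is a hypothetical name, with np.split, reshape, np.stack, and transpose standing in for the torch calls.

```python
import numpy as np

def space_to_depth_split(x, block_size):
    b, h, w, c = x.shape
    d_depth = c * block_size * block_size
    d_height = h // block_size
    d_width = w // block_size
    # Cut along the width axis into d_width strips of shape (B, H, bs, C).
    strips = np.split(x, d_width, axis=2)
    # Each strip holds d_height blocks; flatten every block into the depth axis.
    cols = [s.reshape(b, d_height, d_depth) for s in strips]
    # Stack to (B, d_width, d_height, d_depth), then swap to height-first order.
    return np.stack(cols, axis=1).transpose(0, 2, 1, 3)

x = np.arange(16).reshape(1, 4, 4, 1)
print(space_to_depth_split(x, 2)[0].tolist())
# [[[0, 1, 4, 5], [2, 3, 6, 7]], [[8, 9, 12, 13], [10, 11, 14, 15]]]
```

Each output pixel collects one 2x2 block of the input, read row by row.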

- Lola

Could you explain why anyone would need such an operation? - Abhay

2

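A good PyTorch reference is the implementation of the PixelShuffle module, which shows something similar to TensorFlow's depth_to_space. Based on it, we can implement a pixel_shuffle with a scale factor smaller than 1 that behaves like space_to_depth. For example, downscale_factor=0.5 corresponds to space_to_depth with block_size=2.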

def pixel_shuffle_down(input, downscale_factor):
    # input is expected in NCHW layout; downscale_factor is e.g. 0.5
    batch_size, channels, in_height, in_width = input.size()
    # view() needs integer sizes, so derive an integer block size first
    block_size = int(round(1 / downscale_factor))
    out_channels = channels * block_size ** 2

    out_height = in_height // block_size
    out_width = in_width // block_size

    input_view = input.contiguous().view(
        batch_size, channels, out_height, block_size, out_width, block_size)

    shuffle_out = input_view.permute(0, 1, 3, 5, 2, 4).contiguous()
    return shuffle_out.view(batch_size, out_channels, out_height, out_width)

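Note: I have not verified this implementation and am not sure it is exactly the inverse of pixel shuffle, but this is the basic idea. I have also opened an issue about this on the PyTorch GitHub. In NumPy, the equivalent code would use reshape and transpose instead of view and permute.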
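A NumPy sketch of that reshape/transpose formulation, assuming NCHW input and an integer block_size in place of the fractional downscale_factor. Both function names are illustrative. The second function follows the NHWC ordering described in the TensorFlow documentation, so the two can be compared: they move the same values, but with more than one input channel the output channels come out in a different order, which may matter when porting pretrained weights.

```python
import numpy as np

def pixel_shuffle_down_np(x, block_size):
    """NCHW: (B, C, H, W) -> (B, C*bs*bs, H/bs, W/bs)."""
    b, c, h, w = x.shape
    out_h, out_w = h // block_size, w // block_size
    x = x.reshape(b, c, out_h, block_size, out_w, block_size)
    x = x.transpose(0, 1, 3, 5, 2, 4)  # (B, C, bs, bs, out_h, out_w)
    return x.reshape(b, c * block_size ** 2, out_h, out_w)

def space_to_depth_nhwc(x, block_size):
    """NHWC: (B, H, W, C) -> (B, H/bs, W/bs, bs*bs*C)."""
    b, h, w, c = x.shape
    out_h, out_w = h // block_size, w // block_size
    x = x.reshape(b, out_h, block_size, out_w, block_size, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)  # (B, out_h, out_w, bs, bs, C)
    return x.reshape(b, out_h, out_w, block_size ** 2 * c)

x_nhwc = np.arange(8).reshape(1, 2, 2, 2)  # B=1, H=W=2, C=2
x_nchw = x_nhwc.transpose(0, 3, 1, 2)

tf_order = space_to_depth_nhwc(x_nhwc, 2)[0, 0, 0]
torch_order = pixel_shuffle_down_np(x_nchw, 2)[0, :, 0, 0]
print(tf_order.tolist())     # [0, 1, 2, 3, 4, 5, 6, 7]
print(torch_order.tolist())  # [0, 2, 4, 6, 1, 3, 5, 7]
```

With a single input channel the two orderings coincide, which is why a test on a (1, 2, 2, 1) tensor would not reveal the difference.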

- Allan Zelener

Thanks for your reply! - Moahammad mehdi

-1

Maybe this will work:

sudo apt install nvidia-cuda-toolkit

This worked for me.

- Ali Ganjbakhsh

- davidwangv5 · Accepted Answer

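Conclusion: tf.space_to_depth() simply outputs a copy of the input tensor in which values from the height and width dimensions are moved to the depth dimension.

If you modify your code slightly, like this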

norm = tf.random_normal([1, 2, 2, 1], mean=0, stddev=1)

with tf.Session() as s:
    norm = s.run(norm)

trans = tf.space_to_depth(norm,2)

with tf.Session() as s:
    trans = s.run(trans)

then you will get the following result:

Norm
(1, 2, 2, 1)
-0.130227
2.04587
-0.077691
-0.112031
Trans
(1, 1, 1, 4)
-0.130227
2.04587
-0.077691
-0.112031
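The result above can be reproduced without TensorFlow. The sketch below spells out the index mapping with explicit loops, assuming NHWC layout and the ordering given in the TensorFlow documentation: input element (i, j, c) lands at output position (i // bs, j // bs) in channel ((i % bs) * bs + (j % bs)) * C + c. The name `space_to_depth_loops` is illustrative only.

```python
import numpy as np

def space_to_depth_loops(x, bs):
    b, h, w, c = x.shape
    out = np.empty((b, h // bs, w // bs, c * bs * bs), dtype=x.dtype)
    for i in range(h):
        for j in range(w):
            # The offset inside the block selects which group of C channels to fill.
            k = ((i % bs) * bs + (j % bs)) * c
            out[:, i // bs, j // bs, k:k + c] = x[:, i, j, :]
    return out

norm = np.array([-0.130227, 2.04587, -0.077691, -0.112031]).reshape(1, 2, 2, 1)
trans = space_to_depth_loops(norm, 2)
print(trans.shape)    # (1, 1, 1, 4)
print(trans.ravel())  # the same four values, in the same order
```

For a single 2x2 block with one channel the operation reduces to a plain reshape, which is why the printed values match one for one.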

Hope this helps.