如何高效地转置一个二维位矩阵

13

我经常遇到这个问题(例如在这个问题中)。给定一个二维位矩阵/板/数组,形式为基本整数类型的数组,例如一个long值的数组。为简单起见,我们可以假设一个正方形矩阵,例如在具有64位long的平台上的64个long值的数组。

x [i]表示输入数组,其中0 <= i < 64。计算一个数组y [i],其中0 <= i <= 64,使得:

(x[i] >> j) & 1 == (y[j] >> i) & 1

x >> i 表示对变量 x 进行按位右移 i 位,& 表示按位与操作,x[i] 表示数组 x 中第 i 个位置的值。

如何最有效地实现一个将数组 x 映射到数组 y 的函数?

主要寻找非破坏性方法,使输入数组 x 保持不变。

实现语言

所使用的编程语言应当拥有整型数组和整型按位操作。许多语言都满足这些需求,我们选择 C/C++ 和 Java。


1
这听起来像是一份作业或考试,而且非常高级。我建议你自己回答它;你的声誉表明你已经足够先进了。祝你好运! - Paul Ogilvie
3
@Paul 我可以向你保证这不是一道作业题。这是我时常遇到的问题(请看添加的链接)。如果没有人在这里回答,我可能会自己提交一个解决方案。我认为这个问题具有普遍的兴趣,值得在SO上进行问答。 - ziggystar
4
各位,请不要因为我知道如何编写好的问题/规范就认为这是作业。这真的很令人沮丧。 - ziggystar
1
这个怎么样:https://dev59.com/lVnUa4cB1Zd3GeqPe-Um#6932331 - Malt
@ziggystar:你的用户等级很高,应该知道这种问题不是适合在SO上提问的具体编码问题。太宽泛了。这更适合在softwareengineering.stackexchange.com或cs.stackexchange.com上提问。 - Seki
显示剩余11条评论
2个回答

10
这似乎是一个关于8字节按位转置问题的概括。那个问题只涉及8x8的矩阵转置,所以你提出的问题有点不同。但是你的问题同样可以在书籍Hacker's Delight的第7.3节中得到解答(你也许可以在谷歌图书上查看相关页面)。在那里介绍的代码显然来自Guy Steele

Hacker's Delight网站仅包含书中8x832x32情况的源代码,但后者可以轻松地推广到64x64的情况:

#include <stdint.h>

void
transpose64(uint64_t a[64]) {
  int j, k;
  uint64_t m, t;

  for (j = 32, m = 0x00000000FFFFFFFF; j; j >>= 1, m ^= m << j) {
    for (k = 0; k < 64; k = ((k | j) + 1) & ~j) {
      t = (a[k] ^ (a[k | j] >> j)) & m;
      a[k] ^= t;
      a[k | j] ^= (t << j);
    }
  }
}

这个函数的工作方式是交换逐渐变小的比特块,从32x32的块开始(不转置这些块内部的位),然后在这些32x32的块内部交换适当的16x16块等等。保存块大小的变量为j。因此,外循环使得j依次取32、16、8、4、2和1的值,这意味着外循环运行六次。内循环在您的位的一半行上运行,其中给定变量k中的某个位等于零。当j为32时,它们是0-31行,当j为16时,它们是0-15和32-47行,等等。内部循环部分总共运行了6*32=192次。在这个内部部分发生的事情是,掩码m决定应该交换哪些位,在t中计算这些位的异或,然后使用这个异或列表来适当地更新两个位置的位。
这本书(以及网站)还有一个版本的代码,其中这些循环都已经展开,并且掩码m没有被计算,而只是被赋值。我猜这取决于寄存器的数量和指令缓存的大小,是否会有所改进?
为了测试这个工作是否正常,假设我们定义了一些位模式,比如:
uint64_t logo[] = {
0b0000000000000000000000000000000000000000000100000000000000000000,
0b0000000000000000000000000000000000000000011100000000000000000000,
0b0000000000000000000000000000000000000000111110000000000000000000,
0b0000000000000000000000000000000000000001111111000000000000000000,
0b0000000000000000000000000000000000000000111111100000000000000000,
0b0000000000000000000000000000000000000000111111100000000000000000,
0b0000000000000000000000000000000000000000011111110000000000000000,
0b0000000000000000000000000000000000000000001111111000000000000000,
0b0000000000000000000000000000000000000000001111111100000000000000,
0b0000000000000000000000000000000010000000000111111100000000000000,
0b0000000000000000000000000000000011100000000011111110000000000000,
0b0000000000000000000000000000000111110000000001111111000000000000,
0b0000000000000000000000000000001111111000000001111111100000000000,
0b0000000000000000000000000000011111111100000000111111100000000000,
0b0000000000000000000000000000001111111110000000011111110000000000,
0b0000000000000000000000000000000011111111100000001111111000000000,
0b0000000000000000000000000000000001111111110000001111111100000000,
0b0000000000000000000000000000000000111111111000000111111100000000,
0b0000000000000000000000000000000000011111111100000011111110000000,
0b0000000000000000000000000000000000001111111110000001111111000000,
0b0000000000000000000000000000000000000011111111100001111111100000,
0b0000000000000000000000001100000000000001111111110000111111100000,
0b0000000000000000000000001111000000000000111111111000011111110000,
0b0000000000000000000000011111110000000000011111111100001111100000,
0b0000000000000000000000011111111100000000001111111110001111000000,
0b0000000000000000000000111111111111000000000011111111100110000000,
0b0000000000000000000000011111111111110000000001111111110000000000,
0b0000000000000000000000000111111111111100000000111111111000000000,
0b0000000000000000000000000001111111111111100000011111110000000000,
0b0000000000000000000000000000011111111111111000001111100000000000,
0b0000000000000000000000000000000111111111111110000011000000000000,
0b0000000000000000000000000000000001111111111111100000000000000000,
0b0000000000000000000000000000000000001111111111111000000000000000,
0b0000000000000000000000000000000000000011111111111100000000000000,
0b0000000000000000000111000000000000000000111111111100000000000000,
0b0000000000000000000111111110000000000000001111111000000000000000,
0b0000000000000000000111111111111100000000000011111000000000000000,
0b0000000000000000000111111111111111110000000000110000000000000000,
0b0000000000000000001111111111111111111111100000000000000000000000,
0b0000000000000000001111111111111111111111111111000000000000000000,
0b0000000000000000000000011111111111111111111111100000000000000000,
0b0000001111110000000000000001111111111111111111100000111111000000,
0b0000001111110000000000000000000011111111111111100000111111000000,
0b0000001111110000000000000000000000000111111111100000111111000000,
0b0000001111110000000000000000000000000000001111000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111110000001111111111111111111111111111000000111111000000,
0b0000001111110000001111111111111111111111111111000000111111000000,
0b0000001111110000001111111111111111111111111111000000111111000000,
0b0000001111110000001111111111111111111111111111000000111111000000,
0b0000001111110000001111111111111111111111111111000000111111000000,
0b0000001111110000001111111111111111111111111111000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111110000000000000000000000000000000000000000111111000000,
0b0000001111111111111111111111111111111111111111111111111111000000,
0b0000001111111111111111111111111111111111111111111111111111000000,
0b0000001111111111111111111111111111111111111111111111111111000000,
0b0000001111111111111111111111111111111111111111111111111111000000,
0b0000001111111111111111111111111111111111111111111111111111000000,
0b0000001111111111111111111111111111111111111111111111111111000000,
};

我们接着调用transpose32函数并打印输出结果的比特模式:
#include <stdio.h>

void
printbits(uint64_t a[64]) {
  int i, j;

  for (i = 0; i < 64; i++) {
    for (j = 63; j >= 0; j--)
      printf("%c", (a[i] >> j) & 1 ? '1' : '0');
    printf("\n");
  }
}

int
main() {
  transpose64(logo);
  printbits(logo);
  return 0;
}

然后,这将作为输出结果:
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000011111111111111111111111
0000000000000000000000000000000000000000011111111111111111111111
0000000000000000000000000000000000000000011111111111111111111111
0000000000000000000000000000000000000000011111111111111111111111
0000000000000000000000000000000000000000011111111111111111111111
0000000000000000000000000000000000000000011111111111111111111111
0000000000000000000000000000000000000000000000000000000000111111
0000000000000000000000000000000000000000000000000000000000111111
0000000000000000000000000000000000000000000000000000000000111111
0000000000000000000000000000000000000000000000000000000000111111
0000000000000000000000000000000000000000000000000000000000111111
0000000000000000000000000000000000000000000000000000000000111111
0000000000000000000000000000000000000011000000011111100000111111
0000000000000000000000000000000000111111000000011111100000111111
0000000000000000000000000000000000111111000000011111100000111111
0000000000000000000000000000000000111111000000011111100000111111
0000000000000000000000000100000000011111000000011111100000111111
0000000000000000000000011110000000011111100000011111100000111111
0000000000000000000001111110000000011111100000011111100000111111
0000000000000000000001111111000000011111100000011111100000111111
0000000000000000000000111111000000011111100000011111100000111111
0000000000000000000000111111100000001111110000011111100000111111
0000000000000000000000011111100000001111110000011111100000111111
0000000000000100000000011111110000001111110000011111100000111111
0000000000001110000000001111110000001111110000011111100000111111
0000000000011110000000001111111000001111110000011111100000111111
0000000001111111000000000111111000000111111000011111100000111111
0000000000111111100000000111111100000111111000011111100000111111
0000000000111111110000000011111100000111111000011111100000111111
0000000000011111111000000011111100000111111000011111100000111111
0000000000001111111100000001111110000011111000011111100000111111
0000000000000111111100000001111110000011111100011111100000111111
0000000000000011111110000000111111000011111100011111100000111111
0001000000000001111111000000111111000011111100011111100000111111
0011110000000001111111100000111111100011111100011111100000111111
0111111000000000111111110000011111100001111100011111100000111111
0111111110000000011111111000011111110001111110011111100000111111
1111111111000000001111111000001111110001111110011111100000111111
0011111111100000000111111100001111111001111110011111100000111111
0001111111111000000011111110000111111001111110011111100000111111
0000111111111100000011111111000111111100111100000000000000111111
0000001111111110000001111111100011111100000000000000000000111111
0000000111111111100000111111110011111000000000000000000000111111
0000000011111111110000011111110001100000000000000000000000111111
0000000000111111111000001111111000000000000000000000000000111111
0000000000011111111110000111111000000000000000000000000000111111
0000000000001111111111000111110000000000011111111111111111111111
0000000000000011111111100011100000000000011111111111111111111111
0000000000000001111111111001000000000000011111111111111111111111
0000000000000000111111111100000000000000011111111111111111111111
0000000000000000001111111100000000000000011111111111111111111111
0000000000000000000111111000000000000000011111111111111111111111
0000000000000000000011110000000000000000000000000000000000000000
0000000000000000000000100000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000

这个结果很不错,正如我们所希望的。

编辑:

实际上,这并不是你要求的非破坏性版本的代码。你可以通过将32x32块的第一个交换从xy来获得这个版本。例如,你可以做以下操作:

void
non_destructive_transpose64(uint64_t x[64], uint64_t y[64]) {
  int j, k;
  uint64_t m, t;

  for (k = 0; k < 64; k += 2) {
    ((uint32_t *) y)[k] = ((uint32_t *) x)[k ^ 64 + 1];
    ((uint32_t *) y)[k + 1] = ((uint32_t *) x)[k + 1];
  }
  for (; k < 128; k += 2) {
    ((uint32_t *) y)[k] = ((uint32_t *) x)[k];
    ((uint32_t *) y)[k + 1] = ((uint32_t *) x)[k ^ 64];
  }
  for (j = 16, m = 0x0000FFFF0000FFFF; j; j >>= 1, m ^= m << j) {
    for (k = 0; k < 64; k = ((k | j) + 1) & ~j) {
      t = (y[k] ^ (y[k | j] >> j)) & m;
      y[k] ^= t;
      y[k | j] ^= (t << j);
    }
  }
}

与其他版本不同,这个版本的代码不管架构的字节序如何都无法工作。此外,我知道C标准不允许您将uint64_t数组作为uint32_t数组访问。然而,如果你像这样做,第一次移动块的循环中不需要移位或异或操作,我会喜欢它。

1
@ziggystar:你给了我一个负评吗?无论如何,这个源代码链接包含了你所要求的实现。由于这段代码不是我写的,所以我不想也把它放在这里。 - Freek Wiedijk
1
@Andreas:你给我投了反对票吗?无论如何,我是新来的,不知道怎么做,请指点一下相关文档。而且这并不是_真正_的重复问题,因为另一个问题是关于8x8的,而这个问题是关于64x64的。 - Freek Wiedijk
2
“你是不是给了我踩票?”其实不是你被踩了,而是回答被踩了。我认为这是一个重要的区别。 - alk
1
我没有给负面评分。 事实上,8x8 矩阵并不容易推广。 - ziggystar
@ziggystar:我现在添加了一些源代码,还有一些解释(因为你无法从书中阅读这些页面)。这样更好吗? - Freek Wiedijk
显示剩余7条评论

0

在C++中,8x8的矩阵看起来就是这样,但你可以很容易地将其更改为更通用的形式(不仅限于8x8)。 另外,我已经包含了一个主函数和1个测试向量,只是为了让你有一个感觉:

#include <iostream>
#include <string>
#include <vector>

std::vector<long> rotate(std::vector<long>& v) {
    std::vector<long> temp = { 0,0,0,0,0,0,0,0 };
    for (unsigned int i = 0; i<8; i++) {
        int number = v[i];
        for (unsigned int j = 0; j<8; j++) {
            int z = (number & (1 << (7-j)));
            if (z != 0) {
                temp[j] |= (1 << (7 - i));
            }
        }
    }
    return temp;
}



int main()
{
    std::vector<long> v = { 0, 1, 2, 3, 4, 5, 6, 7 };
    std::vector<long> rotated = rotate(v);
    for (unsigned int i = 0; i<8; i++) {
        std::cout << rotated.at(i) << " ";
    }
    return 0;
}

所以,如果你需要在Java中使用它,你可以轻松地进行翻译,因为Java也提供了位运算符。

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接