由于我正准备在这个相关问题是否有一种非迭代的方法来找到第N个设置位的索引?被关闭之前发布一个相当便携的答案,我想我会在这里发布它,以防其他人觉得它有趣或有用。我发现在热循环中,与链接问题中显示的迭代位移算法相比,它的平均速度要快大约25%,前提是使用-msse4编译。它使用了一个查找表,所以有关热缓存的旧论点也是存在的。
uint64_t pos_of_nth_bit2(uint64_t X, uint64_t bit) {
int32_t testx, pos, pop;
int8_t lut[4][16] = {{0,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0},
{0,0,0,1,0,2,2,1,0,3,3,1,3,2,2,1},
{0,0,0,0,0,0,0,2,0,0,0,3,0,3,3,2},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3}};
_Bool test;
pos = 0;
pop = __builtin_popcount(X & 0xffffffffUL);
test = pop <= bit;
bit -= test*pop;
testx = test*32;
X >>= testx;
pos += testx;
pop = __builtin_popcount(X & 0xffffUL);
test = pop <= bit;
bit -= test*pop;
testx = test*16;
X >>= testx;
pos += testx;
pop = __builtin_popcount(X & 0xffUL);
test = pop <= bit;
bit -= test*pop;
testx = test*8;
X >>= testx;
pos += testx;
pop = __builtin_popcount(X & 0xfUL);
test = pop <= bit;
bit -= test*pop;
testx = test*4;
X >>= testx;
pos += testx;
return pos + lut[bit][X & 0xf];
}