如何在C语言中实现位集？

Question

如何在C语言中实现位集？

13

我一直在使用Java中的Bitset类，现在我想在C语言中做类似的事情。我想大多数情况下都需要手动实现。那么，在C语言中实现一个高效的方式是什么呢？

byte bitset[]

也许

bool bitset[]

?

- David Robles

在内存或CPU方面高效？ - robert

@robert：我认为首先要考虑内存。这是因为可能会有很少的处理开销，但如果出现缓存未命中的情况，就会产生严重的开销。 - ruslik

@robert：有区别吗？如果有大量的位，性能将受到缓存未命中的限制，因此尽可能紧密地打包位将提供最佳性能。只有在位数非常少的情况下，才可能更有效地使用每个位的整个字节（或更多）。 - R.. GitHub STOP HELPING ICE

7个回答

16

没人提到C语言FAQ建议的东西，那就是一堆好用的老宏定义：

#include <limits.h>     /* for CHAR_BIT */

#define BITMASK(b) (1 << ((b) % CHAR_BIT))
#define BITSLOT(b) ((b) / CHAR_BIT)
#define BITSET(a, b) ((a)[BITSLOT(b)] |= BITMASK(b))
#define BITCLEAR(a, b) ((a)[BITSLOT(b)] &= ~BITMASK(b))
#define BITTEST(a, b) ((a)[BITSLOT(b)] & BITMASK(b))
#define BITNSLOTS(nb) ((nb + CHAR_BIT - 1) / CHAR_BIT)

（通过http://c-faq.com/misc/bitsets.html）

- user755921

1

但这并不总是能防止宏副作用，例如尝试： int i = 0, bits; BITSET(bits, i++) - Luke Smith

1

@LukeSmith 你说得有道理，但它似乎被广泛使用。实现宏的正确方式似乎是让调用者理解它是一个宏，从而将责任放在调用者身上。（任何不喜欢这种方式的人都可以将其包装在内联函数中） - Opux

3

嗯，byte bitset[] 看起来有点误导人，不是吗？

在结构体中使用位域，然后您就可以维护这些类型的集合（或根据需要以其他方式使用它们）。

struct packed_struct {
  unsigned int b1:1;
  unsigned int b2:1;
  unsigned int b3:1;
  unsigned int b4:1;
  /* etc. */
} packed;

- Ed S.

2

这不是一个坏主意，适用于一小组标志，但如果您使用位集，则通常希望可以通过整数进行索引。例如，请参见Java位集类。 - President James K. Polk

是的，我后来想到了这一点，然后注意到Mike发了类似的东西。 - Ed S.

2

位域的使用和将索引放入变量名中是低效的。 - R.. GitHub STOP HELPING ICE

2

我推荐我的BITSCAN C++库（刚发布了1.0版本）。BITSCAN专为快速的比特位扫描操作而设计。我使用它来实现NP-Hard组合问题，其中包括简单无向图的最大团（请参见BBMC算法，这是一个领先的精确求解器）。

这里提供了BITSCAN与标准解决方案STL bitset和BOOST dynamic_bitset之间的比较：http://blog.biicode.com/bitscan-efficiency-at-glance/

- chesslover

1

您可以尝试使用 bitsPerItem 为 1 的 PackedArray 代码。

它实现了一种随机访问容器，其中项目在位级上打包。换句话说，它的行为就像您能够操作例如 uint9_t 或 uint17_t 数组一样：

PackedArray principle:
  . compact storage of <= 32 bits items
  . items are tightly packed into a buffer of uint32_t integers

PackedArray requirements:
  . you must know in advance how many bits are needed to hold a single item
  . you must know in advance how many items you want to store
  . when packing, behavior is undefined if items have more than bitsPerItem bits

PackedArray general in memory representation:
  |-------------------------------------------------- - - -
  |       b0       |       b1       |       b2       |
  |-------------------------------------------------- - - -
  | i0 | i1 | i2 | i3 | i4 | i5 | i6 | i7 | i8 | i9 |
  |-------------------------------------------------- - - -

  . items are tightly packed together
  . several items end up inside the same buffer cell, e.g. i0, i1, i2
  . some items span two buffer cells, e.g. i3, i6

- Gregory Pakosz

0

通常情况下，您需要首先确定在位集上需要执行哪些操作。也许是Java定义的某个子集？之后，您可以决定如何最好地实现它。您可以查看OpenJDK中BitSet.java的源代码以获取灵感。

- President James K. Polk

-3

将其转换为无符号64位整数数组。

- EvilTeach

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Mike Axiak · Accepted Answer

CCAN提供了一个bitset的实现，你可以使用：http://ccan.ozlabs.org/info/jbitset.html

但是如果你最终决定自己实现（例如，如果你不喜欢该程序包上的依赖项），你应该使用int数组并使用计算机架构的本地大小：

#define WORD_BITS (8 * sizeof(unsigned int))

unsigned int * bitarray = (int *)calloc(size / 8 + 1, sizeof(unsigned int));

static inline void setIndex(unsigned int * bitarray, size_t idx) {
    bitarray[idx / WORD_BITS] |= (1 << (idx % WORD_BITS));
}

不要使用特定的大小（例如uint64或uint32），让计算机使用它想要使用的大小，并使用sizeof进行适应。