What is the main implementation idea behind sparse hash tables?

Question


data-structures hash hashtable sparsehash

23

Why does Google's open-source sparsehash library have two implementations: a dense hashtable and a sparse one?

- Denis Gorodetskiy

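I think I misunderstood the question in the post. Don't sparse hashtables plus dense hashtables cover all hashtables? If so, why is the library called "sparsehash"? - cHao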

3

By the way: the documentation from Google Code. I will translate the documentation for you, but I will not explain its contents. - cHao

2 Answers

3

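Sparsehash is a memory-efficient way of mapping keys to values (1-2 bits per key). Bloom filters can get you even fewer bits per key, but they do not attach values to keys beyond a definitely-absent/probably-present answer, which carries slightly less than one bit of information.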

- Tobu


- Fred Foo · Accepted Answer

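A dense hashtable is your ordinary textbook hashtable implementation.

A sparse hashtable stores only the elements that have actually been set, divided over a number of arrays. To quote from the comments in the implementation of sparse tables: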

// The idea is that a table with (logically) t buckets is divided
// into t/M *groups* of M buckets each.  (M is a constant set in
// GROUP_SIZE for efficiency.)  Each group is stored sparsely.
// Thus, inserting into the table causes some array to grow, which is
// slow but still constant time.  Lookup involves doing a
// logical-position-to-sparse-position lookup, which is also slow but
// constant time.  The larger M is, the slower these operations are
// but the less overhead (slightly).
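
The group split described in the quoted comment comes down to index arithmetic. A minimal sketch, assuming a group size of 48 picked purely for illustration; `kGroupSize`, `BucketLocation`, `locate`, and `group_count` are hypothetical names, not the library's API:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical group size M, chosen only for this illustration.
constexpr std::size_t kGroupSize = 48;

struct BucketLocation {
    std::size_t group;   // index of the group that owns the bucket
    std::size_t offset;  // position of the bucket inside that group
};

// Map a logical bucket index in [0, t) to its group and in-group offset.
inline BucketLocation locate(std::size_t bucket, std::size_t t) {
    assert(bucket < t && "bucket index out of range");
    return BucketLocation{bucket / kGroupSize, bucket % kGroupSize};
}

// Number of groups needed to cover t logical buckets, rounding up.
inline std::size_t group_count(std::size_t t) {
    return (t + kGroupSize - 1) / kGroupSize;
}
```

Because M is fixed, each group can be handled independently, so the cost of growing or searching one group does not depend on the total table size t.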

To know which elements of the arrays are set, a sparse table includes a bitmap:

// To store the sparse array, we store a bitmap B, where B[i] = 1 iff
// bucket i is non-empty.  Then to look up bucket i we really look up
// array[# of 1s before i in B].  This is constant time for fixed M.

Each element therefore incurs an overhead of only 1 bit (in the limit).
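
The bitmap lookup can be sketched as a single group that keeps only its non-empty buckets in a packed array. This is an illustration under stated assumptions, not the library's code: the class `SparseGroup`, the 48-bucket size, the `int` value type, and the use of `std::vector` are all choices made here for brevity.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One sparse group of M logical buckets. M = 48 is an illustrative
// assumption; it only has to fit in the 64-bit bitmap used here.
class SparseGroup {
public:
    static constexpr std::size_t kBuckets = 48;

    // True if logical bucket i currently holds a value.
    bool contains(std::size_t i) const {
        assert(i < kBuckets);
        return ((bitmap_ >> i) & 1u) != 0;
    }

    // Store value in logical bucket i, growing the packed array if needed.
    void set(std::size_t i, int value) {
        const std::size_t pos = rank(i);
        if (contains(i)) {
            values_[pos] = value;  // bucket already occupied: overwrite
        } else {
            // Insert at the packed position, shifting later values up.
            values_.insert(values_.begin() + static_cast<std::ptrdiff_t>(pos),
                           value);
            bitmap_ |= (std::uint64_t{1} << i);
        }
    }

    // Pointer to the value in logical bucket i, or nullptr if it is empty.
    const int* find(std::size_t i) const {
        return contains(i) ? &values_[rank(i)] : nullptr;
    }

    // Number of values actually stored, not the number of logical buckets.
    std::size_t stored() const { return values_.size(); }

private:
    // Count the 1 bits strictly below position i: the packed-array index.
    std::size_t rank(std::size_t i) const {
        assert(i < kBuckets);
        const std::uint64_t below = bitmap_ & ((std::uint64_t{1} << i) - 1);
        return std::bitset<64>(below).count();
    }

    std::uint64_t bitmap_ = 0;  // B[i] = 1 iff bucket i is non-empty
    std::vector<int> values_;   // only the non-empty buckets, in bucket order
};
```

Setting buckets 5, 40, and 17 leaves three values in the packed array, and a lookup of bucket 17 counts one set bit below it (bucket 5) to land on packed index 1. Note that `std::vector` carries its own bookkeeping per group, which the 1-bit-per-element figure does not account for.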