Heapify in logarithmic time using the C++ standard library

Question

7

I have a heap built with std::make_heap:

std::vector<int> v{1,2,3,5,9,20,3};
std::make_heap(v.begin(), v.end());

Now I update the heap by changing one random element:

v[3] = 35;

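Is there a way in the standard library to readjust the heap in O(log n) time, where n is the size of the container? Basically, I am looking for a heapify function. I know which element has been changed.

I understand that std::make_heap takes O(n log n) time. I have also looked at a similar question, but that one is about changing the maximum element, and a solution with O(log n) time complexity has already been given for it.

I am trying to change an arbitrary element of the heap.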

- code707

1

If you plan to modify the heap frequently, I would probably use std::priority_queue. - erip

Possible duplicate of Heapify in C++ STL? - Shubham

1

make_heap has only linear complexity, so just call it again. - 463035818_is_not_a_number

4 Answers

5

You can do it yourself:

#include <cstddef> // size_t
#include <vector>

// Set heap[index] to value and restore the max-heap property in O(log n).
void modify_heap_element(std::vector<int> &heap, size_t index, int value)
{
    //while value is too large for its position, bubble up
    while(index > 0 && heap[(index-1)>>1] < value)
    {
        size_t parent = (index-1)>>1;
        heap[index]=heap[parent];
        index = parent;
    }
    //while value is too small for its position, sift down
    for (;;)
    {
        size_t left=index*2+1;
        size_t right=left+1;
        if (left >= heap.size())
            break;
        //pick the larger child (the left one if there is no right child)
        size_t bigchild = (right >= heap.size() || heap[right] < heap[left] ?
                           left : right );
        if (!(value < heap[bigchild]))
           break;
        heap[index]=heap[bigchild];
        index = bigchild;
    }
    heap[index] = value;
}

- Matt Timmermans

1

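I can safely assume this is how the standard algorithms work internally, and I could write it by hand as well. I thought it was reasonable to ask for a standard heapify algorithm. - code707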

2

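@code707 This function is not in the standard library because there are not many use cases for it. I don't know what you are doing, but if you need a heap with modifiable elements for something like Dijkstra's algorithm, this function may not be a good fit. To find the element that needs to be modified, you usually need a back pointer from the value to its position in the heap. In that case you cannot use any of the STL functions, and the function above has to be modified to maintain those back pointers while it moves elements into place. - Matt Timmermans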
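
The back-pointer idea from this comment can be sketched as follows. This is only an illustration under stated assumptions, not code from the thread: the class name IndexedMaxHeap, its members, and the choice of identifying items by the ids 0..n-1 are all hypothetical. The heap stores ids, and pos_ records the slot each id currently occupies, so an update can find its element without searching.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical max-heap over item ids 0..n-1 that tracks where each id sits.
class IndexedMaxHeap {
public:
    explicit IndexedMaxHeap(const std::vector<int>& keys)
        : key_(keys), heap_(keys.size()), pos_(keys.size()) {
        for (std::size_t i = 0; i < heap_.size(); ++i) { heap_[i] = i; pos_[i] = i; }
        // Bottom-up construction: sift down every internal node, O(n) overall.
        for (std::size_t i = heap_.size() / 2; i-- > 0;) sift_down(i);
    }

    std::size_t top_id() const { return heap_.front(); }  // requires a non-empty heap
    int key(std::size_t id) const { return key_[id]; }

    // Change the key of item `id` and restore the heap property in O(log n).
    void update(std::size_t id, int new_key) {
        assert(id < pos_.size());
        key_[id] = new_key;
        sift_up(pos_[id]);    // at most one of the two calls moves the item
        sift_down(pos_[id]);
    }

    // Linear-time consistency check of heap order and back pointers (testing only).
    bool valid() const {
        for (std::size_t i = 0; i < heap_.size(); ++i) {
            if (pos_[heap_[i]] != i) return false;
            if (i > 0 && key_[heap_[(i - 1) / 2]] < key_[heap_[i]]) return false;
        }
        return true;
    }

private:
    // Every move goes through place(), which keeps the back pointer in sync.
    void place(std::size_t slot, std::size_t id) { heap_[slot] = id; pos_[id] = slot; }

    void sift_up(std::size_t i) {
        const std::size_t id = heap_[i];
        while (i > 0) {
            const std::size_t parent = (i - 1) / 2;
            if (!(key_[heap_[parent]] < key_[id])) break;
            place(i, heap_[parent]);
            i = parent;
        }
        place(i, id);
    }

    void sift_down(std::size_t i) {
        const std::size_t id = heap_[i];
        const std::size_t n = heap_.size();
        for (;;) {
            std::size_t child = 2 * i + 1;
            if (child >= n) break;
            if (child + 1 < n && key_[heap_[child]] < key_[heap_[child + 1]]) ++child;
            if (!(key_[id] < key_[heap_[child]])) break;
            place(i, heap_[child]);
            i = child;
        }
        place(i, id);
    }

    std::vector<int> key_;           // key_[id] = current priority of item id
    std::vector<std::size_t> heap_;  // heap_[k] = id stored at heap slot k
    std::vector<std::size_t> pos_;   // pos_[id] = heap slot holding id (the back pointer)
};
```

With the question's data, constructing the heap from {1,2,3,5,9,20,3} and calling update(3, 35) would move id 3 to the top without any search for its position.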

2

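I have repeatedly run into the problem of wanting an "updatable heap". In the end, however, I did not write a custom updatable heap or anything like it; I solved the problem in a slightly different way.

To keep access to the best element without explicitly traversing the whole heap, you can use versioned wrappers around the elements you want to order. Each unique real element has a version counter that is incremented every time the element changes. Each wrapper inside the heap then carries one version of the element, namely the version at the time the wrapper was created.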

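// HeapElem is the caller's own type (not shown here); it is assumed to
// expose a size_t currentVersion member that is incremented on every change.
struct HeapElemWrapper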
{
    HeapElem * e;

    size_t version;        
    double priority;

    HeapElemWrapper(HeapElem * elem)
     : e(elem), version(elem->currentVersion), priority(0.0)
    {}

    bool upToDate() const
    {
        return version == e->currentVersion;
    }

    // operator for ordering with heap / priority queue:
    // smaller error -> higher priority
    bool operator<(const HeapElemWrapper & other) const
    {
        return this->priority> other.priority;
    }
};

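When you pop the top element from the heap, you can check whether that wrapper is still up to date with the original element. If it is not, discard it and pop the next one. This approach is quite efficient, and I have seen it used in other applications as well. The only caveat is that you need to sweep the heap periodically (say, every 1000 insertions) to clear out stale elements.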
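
A minimal runnable sketch of this versioning scheme, built on std::priority_queue, might look like the following. The names Item, Entry, and LazyQueue are hypothetical and not from the answer, and unlike the wrapper above, this sketch puts the largest priority on top. It also assumes the Item objects outlive the queue and are not relocated, since the queue stores raw pointers to them. The periodic sweep of stale entries is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical element type: the real object whose priority can change.
struct Item {
    double priority = 0.0;
    std::size_t currentVersion = 0;
};

// Snapshot of an item as it looked when it was pushed.
struct Entry {
    const Item* item;
    std::size_t version;
    double priority;

    bool upToDate() const {
        assert(item != nullptr);
        return version == item->currentVersion;
    }
    // std::priority_queue is a max-heap, so the largest priority ends up on top.
    bool operator<(const Entry& other) const { return priority < other.priority; }
};

class LazyQueue {
public:
    // Insert a fresh snapshot of `item`, O(log n).
    void push(const Item& item) {
        queue_.push(Entry{&item, item.currentVersion, item.priority});
    }

    // Change the priority: bump the version (which makes older snapshots
    // stale) and push a new snapshot instead of repairing the heap.
    void update(Item& item, double newPriority) {
        item.priority = newPriority;
        ++item.currentVersion;
        push(item);
    }

    // Discard stale snapshots until a current one is found; nullptr if none is left.
    const Item* popBest() {
        while (!queue_.empty()) {
            Entry top = queue_.top();
            queue_.pop();
            if (top.upToDate()) return top.item;
        }
        return nullptr;
    }

private:
    std::priority_queue<Entry> queue_;
};
```

Each update costs one O(log n) push; the price is that stale snapshots stay in the queue until they reach the top or are swept out.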

- volzotan

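If the updated element is the top element, how will it move to the top? - code707

Not sure I understand the question: if the element is already at the top, what is the problem? In any case, the idea is that you insert a copy carrying the element's latest version into the heap. That new heap entry bubbles up to its correct position automatically, and the heap does not have to be rebuilt just because some element in the middle was modified. - volzotan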

2

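With the function templates std::pop_heap() and std::push_heap() provided by the standard library, it is not possible to modify an arbitrary element of a heap in logarithmic time without violating the heap property.

However, you can define your own STL-like function template, set_heap_element(), for this purpose: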

#include <utility> // std::move, std::swap

template<typename RandomIt, typename T, typename Cmp>
void set_heap_element(RandomIt first, RandomIt last, RandomIt pos, T value, Cmp cmp)
{
    const auto n = last - first;
    *pos = std::move(value); // replace previous value

    auto i = pos - first;
    using std::swap;

    // percolate up
    while (i > 0) { // non-root node
        auto parent_it = first + (i-1)/2;

        // same test as std::push_heap: stop unless the parent orders before the node
        if (!cmp(*parent_it, *pos))
            break; // parent node satisfies the heap-property

        swap(*pos, *parent_it); // swap with parent
        pos = parent_it;
        i = pos - first;
    }

    // percolate down
    while (2*i + 1 < n) { // non-leaf node, since it has a left child
        const auto lidx = 2*i + 1, ridx = 2*i + 2;

        auto lchild_it = first + lidx;
        auto rchild_it = ridx < n? first + ridx: last;

        // it ends up pointing at the largest of node, left child, right child
        auto it = pos;
        if (cmp(*it, *lchild_it))
            it = lchild_it;
        if (rchild_it != last && cmp(*it, *rchild_it))
            it = rchild_it;

        if (pos == it)
            break; // node satisfies the heap-property

        swap(*pos, *it); // swap with child
        pos = it;
        i = pos - first;
    }
}

Next, you can provide the following simplified overload of set_heap_element() for the max-heap case:

#include <functional> // std::less
#include <utility>    // std::move

template<typename RandomIt, typename T>
void set_heap_element(RandomIt first, RandomIt last, RandomIt pos, T value) {
    // forward to the general version with the default max-heap ordering
    set_heap_element(first, last, pos, std::move(value), std::less<T>{});
}

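This overload passes a std::less<T> object as the comparison function object to the original function template.

Example

For your max-heap example, set_heap_element() can be used as follows: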

std::vector<int> v{1,2,3,5,9,20,3};
std::make_heap(v.begin(), v.end());

// set 4th element to 35 in O(log n)
set_heap_element(v.begin(), v.end(), v.begin() + 3, 35);

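You can use std::is_heap() to check that v still satisfies the max-heap property after an element has been set with the set_heap_element() function template above. Note that std::is_heap() itself takes linear time.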

assert(std::is_heap(v.begin(), v.end()));

What about min-heaps?

For a min-heap, pass a std::greater<int> object as the last argument to std::make_heap(), set_heap_element(), and std::is_heap():

std::vector<int> v{1,2,3,5,9,20,3};
// create a min heap
std::make_heap(v.begin(), v.end(), std::greater<int>{});

// set 4th element to 35 in O(log n)
set_heap_element(v.begin(), v.end(), v.begin() + 3, 35, std::greater<int>{});

// is the min-heap property satisfied?
assert(std::is_heap(v.begin(), v.end(), std::greater<int>{}));

- JFMR

- ローウ · Accepted Answer

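If we take a closer look at your statement:

Now, I disturb the heap by changing one random element of the heap.

To heapify in O(log n) time, you can only directly "disturb" the back or the front of the vector (which corresponds, roughly, to inserting or removing an element). In those cases, (re)heapification can be achieved with the std::push_heap and std::pop_heap algorithms, which run in logarithmic time.

That is, the back: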

v.back() = 35;
std::push_heap(v.begin(), v.end()); // heapify in O(log n)

Or the front:

v.front() = 35;

// places the front at the back
std::pop_heap(v.begin(), v.end()); // O(log n)
// v.back() is now 35, but it does not belong to the heap anymore

// make the back belong to the heap again
std::push_heap(v.begin(), v.end()); // O(log n)
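
One caveat on the front case: std::pop_heap is specified to take a range that is already a valid heap, and overwriting v.front() first may break that. A more cautious ordering, sketched here under that reading, pops first and overwrites the back afterwards. The helper name replace_top is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Replace the top of a non-empty max-heap with `value` in O(log n).
void replace_top(std::vector<int>& heap, int value) {
    assert(!heap.empty());
    std::pop_heap(heap.begin(), heap.end());   // old top moves to heap.back()
    heap.back() = value;                       // overwrite it; [begin, end-1) is still a heap
    std::push_heap(heap.begin(), heap.end());  // sift the new value up into place
}
```

Both calls see ranges that satisfy their documented preconditions, and the total cost is still two logarithmic operations.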

Otherwise, you need to re-heapify the whole vector with std::make_heap, which takes linear time.
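
For completeness, the linear-time fallback can be wrapped in a small helper. This is only a sketch; the name set_and_rebuild is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Overwrite heap[index] and rebuild the whole max-heap; O(n) per update.
void set_and_rebuild(std::vector<int>& heap, std::size_t index, int value) {
    assert(index < heap.size());
    heap[index] = value;                       // may break the heap property
    std::make_heap(heap.begin(), heap.end());  // linear-time rebuild restores it
}
```

This is the simplest option and may be perfectly adequate for small heaps or infrequent updates.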

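Summary

With the standard library (i.e., the function templates std::push_heap and std::pop_heap), it is not possible to modify an arbitrary element of a heap and re-heapify in logarithmic time. However, you can implement the heap's sift-up and sift-down operations yourself to re-heapify in logarithmic time.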