我是否正确实现了“堆化”算法?

18

我正在为计算机科学课程创建一个堆实现,我想知道以下递归函数是否可以将非堆数组对象转换为堆。

代码如下:

void Heap::Heapify(int i)
{
    int temp, l, r, heapify;
    l = LeftChild(i);// get the left child
    r = RightChild(i);// get the right child

    //if one of the children is bigger than the index
    if((Data[i] < Data[l]) || (Data[i]< Data[r]))
    {
        //if left is the bigger child
        if(Data[l] > Data[r])
        {
            //swap parent with left child
            temp = Data[i];
            Data[i] = Data[l];
            Data[l] = temp;
            heapify = l; // index that was swapped
        }
        //if right is the bigger child
        else
        {
            //swap parent with right child
            temp = Data[i];
            Data[i] = Data[r];
            Data[r] = temp;
            heapify = r; // index that was swapped
        }
        // do a recursive call with the index
        //that was swapped
        Heapify(heapify);
    }
}

这个思路是检查给定索引的数据是否比它的所有子节点都大。如果是,函数直接结束。否则,它会检查哪个子节点最大(左边的还是右边的),并将其与该索引交换。然后在发生交换的索引处调用堆化函数。

根据ildjarn的请求,我包含了完整的类定义和实现文件以帮助回答我的问题:
以下是头文件:

#ifndef HEAP_H
#define HEAP_H
//Programmer: Christopher De Bow
//Date: november 15, 2011

class Heap
{ 
private:
    int Data [100];
    int Parent(int);
    int RightChild(int);
    int LeftChild(int);
    void Heapify(int);
    void BuildHeap();

public:
    Heap();
    void insert();
    void HeapSort();
    void ExtractMaximum();
    int Maximum();
    void PrintHeap();
    int heapsize;
    void SetData(int[]);
};

#endif

还有实现文件:

#include <iostream>
#include "Heap.h"
using namespace std;
//Programmer: Christopher De Bow
//Date: november 15, 2011

Heap::Heap()
{
    int init [10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    heapsize = 10;
    SetData(init);
}

int Heap::Parent(int index)
{
    int Rval;
    if(index%2 == 0)// if the index is even
    {
        Rval = ((index-1)/2);
    }
    else// if the index is odd
    {
        Rval = (index/2);
    }
    return Rval;
}

int Heap::RightChild(int arrplace)
{
    int ret;
    ret = ((2*arrplace)+2); //rightchild is index times 2 plus 2
    return ret;
}

int Heap::LeftChild(int i)
{
    int rval;
    rval = ((2*i)+1); //leftchild is index times 2 plus 1
    return rval;
}

void Heap::Heapify(int i)
{
    int temp, l, r, heapify;

    l = LeftChild(i); // get the left child
    r = RightChild(i); // get the right child

    if((l <= heapSize) && (data[l] > data[i]))
    {
        heapify = l;
    {
    else
    {
        heapfiy = i;
    }
    if((r <= heapSize) && (data[r] > data[heapify]))
    {
        heapify = r;
    }
    if(heapify != i) // one of the two child nodes has proved
    {                // larger than Data[i], so interchange values
        //swap parent with left child
        temp = Data[i];
        Data[i] = Data[heapify];
        Data[heapify] = temp;
        Heapify(heapify);
    }
}

void Heap::BuildHeap()
{
    // we do not have a heap
    // we will make a heap
    // by calling heapify starting at the lowest
    // internal node in the heap
    for(int i = heapsize; i >= 1; i--)
    {
        Heapify(i-1);
    }
}

void Heap::insert()
{
    int insert;
    heapsize = (heapsize + 1);
    //getting data from the user
    cout<<"what data would you like to insert?"<<endl;
    cin>>insert;
    Data[heapsize] = insert;
    BuildHeap(); //call BuildHeap on array
    cout<<"done"<<endl;
}

void Heap::PrintHeap()
{
    BuildHeap();
    for(int count = 0; count < (heapsize-1); count++)
    {
        cout<<Data[count];// print out every element in heap
    }
    cout<<endl<<endl;
}

void Heap::HeapSort()
{
    BuildHeap();
    int temp;
    // do this for every elem in heap:
    for(int i = 0; i < heapsize; i++)
    {
        temp = Data[heapsize-1];
        Data[heapsize-1] = Data[0];
        Data[0] = temp;
        heapsize--;
        BuildHeap();
    }
    PrintHeap();
}

void Heap::ExtractMaximum()
{
    BuildHeap();
    //assign last thing in heap to first thing in heap
    Data[0] = Data[heapsize];
    heapsize --; // decrease heapsize by one
    Heapify(0); // heapify from the top
}

int Heap::Maximum()
{
    int Rval;
    BuildHeap();// make sure we have a heap
    Rval = Data[0];
    return Rval; // return top thing
}

//initialize the elements in the "Data" array
void Heap::SetData(int x[])
{
    for(int i = 0; i <= (heapsize); i++)
    {
        Data[i] = x[i]; 
    }
}

20
耶!一道有努力证明的作业问题!+1 - vcsjones
@vcsjones:确实,可悲的是这种情况很少见。 - ildjarn
4个回答

9
您的算法有效。问题在于将算法翻译成代码。假设您声明了Data如下:
int Data[7];

您需要将其初始化为{0, 1, 2, 3, 4, 5, 6}等初始值。假设LeftChild(i)RightChild(i)的定义如下:

#define LeftChild(i) ((i << 1) + 1)
#define RightChild(i) ((i << 1) + 2)

接下来是您的函数BuildHeap(),应该是以下内容:

void Heap::BuildHeap()
{
    for(int i = (7 >> 1); i >= 1; i--) // in general, replace 7 with 
                                       // (sizeof(Data)/sizeof(int)), presuming 
                                       // you have an array of int's. if not, 
                                       // replace int with the relevant data type
    Heapify(i-1);
}

将在右下角子树的根节点开启Heapify进程。在本例中,这是数组索引2,其左子节点为5,右子节点为6。Heapify将正确交换2和6,并递归调用Heapify(6)

这里整个过程可能会停滞不前!目前你的树看起来像:

                         0
                    1         2
                 3     4   5     6
            u n d e f i n e d  s p a c e

因此,调用Heapify(6)将会比较Data[6]的值与Data[13]Data[14]的值(这是C++和其缺乏数组边界强制执行的危险,不像Java)。显然,后两个值可以是存储在RAM中的任何垃圾数据。一个丑陋但可行的解决方案是在声明Data时添加8个元素,并将它们全部初始化为低于数组中任何元素的某个值。更好的解决方案是向你的类添加一个heapSize变量,并将其设置为数组的长度:

heapSize = (sizeof(Data)/sizeof(int));

然后将逻辑集成到仅比较树的有效叶子节点。这个实现方式很高效:

void Heap::Heapify(int i)
{
    int temp, l, r, heapify;

    l = LeftChild(i); // get the left child
    r = RightChild(i); // get the right child

    if((l <= heapSize) && (Data[l] > Data[i]))
        heapify = l;
    else heapfiy = i;
    if((r <= heapSize) && (Data[r] > Data[heapify]))
        heapify = r;
    if(heapify != i) // one of the two child nodes has proved 
                     // larger than Data[i], so interchange values
    {
        //swap parent with left child
        temp = Data[i];
        Data[i] = Data[heapify];
        Data[heapify] = temp;

        Heapify(heapify);
    }
}

因此,简而言之,解决方案就是添加逻辑以确保子节点是树的有效叶子,并且您的主函数将类似于以下内容:
Heap heap;

// initialize Data here    

heap.BuildHeap();

希望这对您有所帮助。

8

不在树上。

      1
     / \
    /   \
   /     \
  2       3
 / \     / \
6   7   4   5

输出结果将是:
      3
     / \
    /   \
   /     \
  2       5
 / \     / \
6   7   4   1

该代码存在几个堆违规情况。(我假设如果相应子节点不存在,则Data[l]Data[r]等于负无穷。您可能需要额外的逻辑来确保这一点。)

您的函数做的是修复一棵可能不是堆的树,但其左右子树都是堆。您需要在每个节点上调用它,按后序遍历的方式(即从n-1到0),这样当Heapify(i)被调用时,i的子节点就是堆。


1
你不必为叶节点调用它。 - Karoly Horvath

4

您的代码现在成功构建了堆。只有一个概念上的缺陷:其他错误是偏移量错误。其中一个根本性错误出在BuildHeap中:您有

for(int i = heapSize; i >= 1; i--)
{
    Heapify(i-1);
}

应该是这样的

for(int i = (heapSize / 2); i >= 1; i--)
{
    Heapify(i-1);
}

这非常重要,您必须看到Heapify总是在树根上调用的,并且(这真的很酷)您可以轻松地在数组中找到最后一个树根,其索引为((heapSize/2)-1)(这适用于C++和Java样式,其中第一个索引==0)。按照原来的方式,您的代码在树的最后一个叶子上调用了Heapify,这是错误的。
除此之外,我添加了注释以标记偏移一个的错误。我将它们放在左侧以便您轻松找到它们。希望您对算法和数据结构有超级理解! :-)
您的头文件:
#ifndef HEAP_H
#define HEAP_H
//Programmer: Christopher De Bow
//Date: november 15, 2011

class Heap
{
private:
    int Data [100];
    int Parent(int);
    int RightChild(int);
    int LeftChild(int);
    void Heapify(int);
    void BuildHeap();
// SO added heapSize
    int heapSize;

public:
    Heap();
    void insert();
    void HeapSort();
    void ExtractMaximum();
    int Maximum();
    void PrintHeap();
    int heapsize;
    void SetData(int[]);
};

#endif

您的cpp文件:

#include <iostream>
#include "Heap.h"
using namespace std;
//Programmer: Christopher De Bow
//Date: november 15, 2011

Heap::Heap()
{
    int init [10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    heapSize = 10;
    SetData(init);
}

int Heap::Parent(int index)
{
    int Rval;
    if(index%2 == 0)// if the index is even
    {
        Rval = ((index-1)/2);
    }
    else// if the index is odd
    {
        Rval = (index/2);
    }
    return Rval;
}

int Heap::RightChild(int arrplace)
{
    int ret;
    ret = ((2*arrplace)+2); //rightchild is index times 2 plus 2
    return ret;
}

int Heap::LeftChild(int i)
{
    int rval;
    rval = ((2*i)+1); //leftchild is index times 2 plus 1
    return rval;
}

void Heap::Heapify(int i)
{
    int temp, l, r, heapify;

    l = LeftChild(i); // get the left child
    r = RightChild(i); // get the right child

// you have to compare the index to (heapSize - 1) because we are working
// with C++ and the first array index is 0 : l and r are direct indices
// into the array, so the maximum possible index is the heapSize'th 
// element, which is at heapSize-1. this was kind of nasty as it let the 
// heapify index get too large and led to a swap with memory beyond the
// last element of the array (again, C++ doesn't enforce array boundaries
// as Java does).
    if((l <= (heapSize-1)) && (Data[l] > Data[i]))
        heapify = l;
    else
        heapify = i;
// you have to compare the index to (heapSize - 1) because we are working
// with C++ and the first array index is 0 : l and r are direct indices
// into the array, so the maximum possible index is the heapSize'th 
// element, which is at heapSize-1. this was kind of nasty as it let the 
// heapify index get too large and led to a swap with memory beyond the
// last element of the array (again, C++ doesn't enforce array boundaries
// as Java does).
    if((r <= (heapSize-1)) && (Data[r] > Data[heapify]))
        heapify = r;
    if(heapify != i) // one of the two child nodes has proved
    {                // larger than Data[i], so interchange values
        //swap parent with left child
        temp = Data[i];
        Data[i] = Data[heapify];
        Data[heapify] = temp;
        Heapify(heapify);
    }
}

void Heap::BuildHeap()
{
    // we do not have a heap
    // we will make a heap
    // by calling heapify starting at the lowest
    // internal node in the heap
// i must be initialized to (heapsize/2), please see my 
// post for an explanation
    for(int i = heapSize/2; i >= 1; i--)
    {
        Heapify(i-1);
    }
}

void Heap::insert()
{
    int insert;
    heapSize = (heapSize + 1);
    //getting data from the user
    cout<<"what data would you like to insert?"<<endl;
    cin>>insert;
    Data[heapSize] = insert;
    BuildHeap(); //call BuildHeap on array
    cout<<"done"<<endl;
}

void Heap::PrintHeap()
{
    BuildHeap();
// the array indices are from 0 through (heapSize-1), so 
// count must be less than _or equal to_ (heapSize-1). another
// way of phrasing this (which i applied in this function)
// is (count < heapSize). you'll get better boundary conditions
// with practice.
    for(int count = 0; count < heapSize; count++)
    {
// added an endl to the output for clarity
        cout << Data[count] << endl;// print out every element in heap
    }
    cout<<endl<<endl;
}

void Heap::HeapSort()
{
    BuildHeap();
    int temp;
    // do this for every elem in heap:
    for(int i = 0; i < heapSize; i++)
    {
        temp = Data[heapSize-1];
        Data[heapSize-1] = Data[0];
        Data[0] = temp;
        heapSize--;
        BuildHeap();
    }
    PrintHeap();
}

void Heap::ExtractMaximum()
{
    BuildHeap();
    //assign last thing in heap to first thing in heap
    Data[0] = Data[heapSize];
    heapSize--; // decrease heapSize by one
    Heapify(0); // heapify from the top
}

int Heap::Maximum()
{
    int Rval;
    BuildHeap();// make sure we have a heap
    Rval = Data[0];
    return Rval; // return top thing
}

//initialize the elements in the "Data" array
void Heap::SetData(int x[])
{
// the array indices are from 0 through (heapSize-1), so 
// count must be less than _or equal to_ (heapSize-1). another
// way of phrasing this (which i applied in this function)
// is (i < heapSize). you'll get better boundary conditions
// with practice.
    for(int i = 0; i < heapSize; i++)
    {
        Data[i] = x[i];
    }
}

// basic confirmation function
int main()
{
    Heap heap;
    heap.PrintHeap();

    return 0;
}

1

你在这里编写的代码肯定感觉不错;但是没有什么比编写一些测试用例来看它的表现更好了。一定要对包含1、2、3、4和数十个元素的堆进行测试。(我预计基本情况会出现问题——当i没有子节点时,它如何处理?在小堆上进行测试应该很快就能发现。)

关于这段代码的一些小建议:

if(Data[l] > Data[r])
{
    //swap parent with left child
    temp = Data[i];
    Data[i] = Data[l];
    Data[l] = temp;
    heapify = l; // index that was swapped
}
//if right is the bigger child
else
    { //swap parent with right child
    temp = Data[i];
    Data[i] = Data[r];
    Data[r] = temp;
    heapify = r; // index that was swapped
}

通过仅在if块中设置index,您可能可以获得更好的可读性:

if(Data[l] > Data[r]) {
    swapme = l;
} else {
    swapme = r;
}

temp = Data[i];
Data[i] = Data[swapme];
Data[swapme] = temp;
heapify = swapme;

我已经运行了几次,但它不起作用。它对堆栈实际上什么也没做。然而,我正在使用一个不同的函数来调用它,但是所有该函数所做的就是调用最低的内部节点,然后从那里调用heapify。我明天会问我的教授,但我就是不理解 @_@ - Chris De Bow
1
@Chris:你能否更新一下你的问题,包括完整的类定义?问题可能出在其他地方,例如LeftChildRightChild的语义,或者Data声明的方式。 - ildjarn

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接