Finding where heap memory gets corrupted

Question

9

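I know there are already many similar questions and answers, but I have not been able to solve my problem.

Somewhere in my large application the heap is getting corrupted, and I cannot locate where. I have tried tools such as gflags, but with no luck.

I tried gflags on the following sample, which corrupts the heap on purpose: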

char* pBuffer = new char[256];
memset(pBuffer, 0, 256 + 1);
delete[] pBuffer;

The heap is overwritten on line 2, but how can I find that with tools such as gflags or windbg? Perhaps I am not using gflags correctly.

- Anil8753

3

Why do you write 256 + 1 in the memset call when you only allocated 256 bytes? - T.Z

3

How do you know the heap is being corrupted in your large application? Which tool told you? - PaulMcKenzie

3

@T.Z. To demonstrate the kind of corruption that may be happening... - StoryTeller - Unslander Monica

1

Could you try an alternative to gflags? Here is a list of potential tools for finding memory corruption on Windows. - YSC

3

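Note that your sample will not necessarily corrupt the heap every time, because the next heap allocation header may be located far from the allocated memory. A more interesting strategy is to allocate the data and write from the start of the buffer towards decreasing memory addresses. - SirDarius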

4 Answers

1

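If the same variable is consistently getting corrupted, a data breakpoint is a quick and easy way to find the code responsible for the change (if your IDE supports them). (In MS Visual Studio 2008: Debug -> New Breakpoint -> New Data Breakpoint...) It won't help if your heap corruption is more random, but I'm sharing the simple answer in case it helps.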

- Robert

0

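There is a tool called Electric Fence, which I think is also supported on Windows.

Essentially, it hijacks malloc and friends so that every allocation ends at a page boundary, and it marks the following page as inaccessible.

The effect is that a buffer overrun causes a segmentation fault.

It probably also has an option for catching buffer underruns.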
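To make the guard-page idea concrete, here is a minimal POSIX-only sketch. It is not Electric Fence's actual implementation: `GuardedBlock`, `GuardedAlloc` and `GuardedFree` are hypothetical names, and on Windows `VirtualAlloc`/`VirtualProtect` would have to stand in for `mmap`/`mprotect`.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper types/functions, for illustration only.
struct GuardedBlock {
    void*  user;      // pointer handed to the caller
    void*  mapping;   // start of the whole mapping
    size_t mapBytes;  // total mapped size, including the guard page
};

static GuardedBlock GuardedAlloc(size_t userBytes)
{
    const size_t page      = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    const size_t dataPages = (userBytes + page - 1) / page;  // round up to whole pages
    const size_t mapBytes  = (dataPages + 1) * page;         // +1 for the guard page

    void* m = mmap(nullptr, mapBytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED) return GuardedBlock{nullptr, nullptr, 0};

    char* guard = static_cast<char*>(m) + dataPages * page;
    assert(reinterpret_cast<uintptr_t>(guard) % page == 0);  // mmap returns page-aligned memory

    // Make the last page inaccessible: touching it raises SIGSEGV.
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(m, mapBytes);
        return GuardedBlock{nullptr, nullptr, 0};
    }

    // Place the user's bytes so that they end exactly where the guard page begins.
    // Caveat: the returned pointer is only as aligned as userBytes allows.
    return GuardedBlock{guard - userBytes, m, mapBytes};
}

static void GuardedFree(const GuardedBlock& b)
{
    if (b.mapping) munmap(b.mapping, b.mapBytes);
}
```

With a block obtained from `GuardedAlloc(256)`, writing 256 bytes is fine, while writing byte 257 lands in the protected page and faults immediately at the offending instruction. The price is at least two pages per allocation.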

- BitWhistler

0

Please read the following links: "Visual Studio - how to find source of heap corruption errors" and "Is there a good Valgrind substitute for Windows?" They describe techniques for finding heap problems on Windows.

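On the other hand, if you are writing new code, you can always write your own memory manager. The approach: call malloc / calloc etc. through wrapper APIs.

Suppose you have an API myMalloc(size_t len); inside that function you allocate HEADER + len + FOOTER bytes. In the header, store information such as the size of the allocation, or more. In the footer, write a magic number such as deadbeef. Return ptr (from malloc) + HEADER from myMalloc.

When freeing with myfree(void *ptr), compute ptr - HEADER, read len, then jump to the footer at (ptr - HEADER) + HEADER + len, that is, just past the user's len bytes. At that offset you should find deadbeef; if you don't, you know the block has been corrupted.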
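The wrapper described above could look roughly like the following sketch. The header layout, the `HEADER_BYTES` and `FOOTER_MAGIC` constants and the `myCheck` helper are assumptions made for illustration; only `myMalloc`, `myfree` (spelled `myFree` here) and the deadbeef footer come from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>

static const uint32_t FOOTER_MAGIC = 0xDEADBEEF;
// Assumed header size: one max-alignment unit, so the user pointer stays suitably aligned.
static const size_t HEADER_BYTES = alignof(std::max_align_t);
static_assert(HEADER_BYTES >= sizeof(size_t), "header must be able to hold the length");

void* myMalloc(size_t len)
{
    char* raw = static_cast<char*>(std::malloc(HEADER_BYTES + len + sizeof(FOOTER_MAGIC)));
    if (!raw) return nullptr;
    std::memcpy(raw, &len, sizeof(len));                    // header: requested length
    std::memcpy(raw + HEADER_BYTES + len, &FOOTER_MAGIC,    // footer: magic number
                sizeof(FOOTER_MAGIC));
    return raw + HEADER_BYTES;                              // pointer handed to the user
}

// Hypothetical helper: returns true if the footer of this block is still intact.
bool myCheck(const void* ptr)
{
    assert(ptr != nullptr);
    const char* raw = static_cast<const char*>(ptr) - HEADER_BYTES;
    size_t len;
    std::memcpy(&len, raw, sizeof(len));
    uint32_t magic;
    std::memcpy(&magic, raw + HEADER_BYTES + len, sizeof(magic));
    return magic == FOOTER_MAGIC;
}

void myFree(void* ptr)
{
    if (!ptr) return;
    if (!myCheck(ptr)) {
        std::fprintf(stderr, "heap corruption: footer of block %p was overwritten\n", ptr);
        std::abort();
    }
    std::free(static_cast<char*>(ptr) - HEADER_BYTES);
}
```

Applied to the sample from the question, a block from `myMalloc(256)` followed by `memset(p, 0, 256 + 1)` clobbers the first footer byte, so `myFree(p)` reports the problem. Note that this only catches overruns, only at free time, and relies on the header itself not having been overwritten.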

- Ritesh

- Jeremy Friesner · Accepted Answer

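If automated tools (such as Electric Fence or valgrind) don't do the trick, staring intently at your code to figure out where it might have gone wrong doesn't help, and disabling/enabling various operations (until you find a correlation between the presence of heap corruption and which operations did or didn't execute beforehand) doesn't narrow it down either, you can try this technique, which attempts to detect the corruption sooner rather than later so that its source is easier to track down:

Create your own custom new and delete operators that place corruption-evident guard areas around the allocated memory regions, something like this: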

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <new>

// make this however big you feel is "big enough" so that corrupted bytes will be seen in the guard bands
static int GUARD_BAND_SIZE_BYTES = 64;

static void * MyCustomAlloc(size_t userNumBytes)
{
    // We'll allocate space for a guard-band, then space to store the user's allocation-size-value,
    // then space for the user's actual data bytes, then finally space for a second guard-band at the end.
    char * buf = (char *) malloc(GUARD_BAND_SIZE_BYTES+sizeof(userNumBytes)+userNumBytes+GUARD_BAND_SIZE_BYTES);
    if (buf)
    {
       char * w = buf;
       memset(w, 'B', GUARD_BAND_SIZE_BYTES);          w += GUARD_BAND_SIZE_BYTES;
       memcpy(w, &userNumBytes, sizeof(userNumBytes)); w += sizeof(userNumBytes);
       char * userRetVal = w;                          w += userNumBytes;
       memset(w, 'E', GUARD_BAND_SIZE_BYTES);          w += GUARD_BAND_SIZE_BYTES;
       return userRetVal;
    }
    else throw std::bad_alloc();
}

static void MyCustomDelete(void * p)
{
    if (p == NULL) return;   // since delete NULL is a safe no-op

    // Convert the user's pointer back to a pointer to the top of our header bytes
    char * internalCP = ((char *) p)-(GUARD_BAND_SIZE_BYTES+sizeof(size_t));

    char * cp = internalCP;
    for (int i=0; i<GUARD_BAND_SIZE_BYTES; i++)
    {
        if (*cp++ != 'B')
        {
            printf("CORRUPTION DETECTED at BEGIN GUARD BAND POSITION %i of allocation %p\n", i, p);
            abort();
        }
    }

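    // At this point, (cp) should be pointing to the stored (userNumBytes) field
    size_t userNumBytes;
    memcpy(&userNumBytes, cp, sizeof(userNumBytes));  // mirrors the memcpy in MyCustomAlloc()
    cp += sizeof(userNumBytes);  // skip past the stored size field
    cp += userNumBytes;          // skip past the user's data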

    // At this point, (cp) should be pointing to the second guard band
    for (int i=0; i<GUARD_BAND_SIZE_BYTES; i++)
    {
        if (*cp++ != 'E')
        {
            printf("CORRUPTION DETECTED at END GUARD BAND POSITION %i of allocation %p\n", i, p);
            abort();
        }
    }

    // If we got here, no corruption was detected, so free the memory and carry on
    free(internalCP);
}

// override the global C++ new/delete operators to call our
// instrumented functions rather than their normal behavior
// (dynamic exception specifications such as throw(std::bad_alloc) were removed
// in C++17, so noexcept is used here; this requires C++11 or later)
void * operator new(size_t s)              {return MyCustomAlloc(s);}
void * operator new[](size_t s)            {return MyCustomAlloc(s);}
void operator delete(void * p) noexcept    {MyCustomDelete(p);}
void operator delete[](void * p) noexcept  {MyCustomDelete(p);}

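The above should be enough to give you some Electric-Fence-style functionality: if anything writes into either of the two 64-byte guard bands at the beginning or end of any new/delete memory allocation, then MyCustomDelete() will notice the corruption and crash the program when that allocation is deleted.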

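If that's not good enough (for example, because by the time the deletion occurs so much has happened since the corruption that it's difficult to tell what caused it), you can go a step further by having MyCustomAlloc() add the allocated buffer to a global doubly-linked list of allocations, and having MyCustomDelete() remove it from that same list (be sure to serialize these operations if your program is multithreaded!). The advantage is that you can then add another function called, say, CheckForHeapCorruption() that walks the linked list, checks the guard bands of every allocation in it, and reports any that have been corrupted. You can then scatter calls to CheckForHeapCorruption() throughout your code, so that when heap corruption occurs it is detected at the next call to CheckForHeapCorruption() rather than at some later time. Eventually you will find that one call to CheckForHeapCorruption() passes and the next call, just a few lines later, detects corruption. At that point you know the corruption was caused by whatever code executed between those two calls, and you can study that particular code to figure out what it is doing wrong, or add more calls to CheckForHeapCorruption() inside it as necessary.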
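A rough sketch of that linked-list extension is shown below. To keep it self-contained it uses stand-alone functions rather than the overridden operators: `TrackedAlloc`, `TrackedFree`, `AllocHeader` and the layout constants are hypothetical names, and only `CheckForHeapCorruption()` and the 'B'/'E' guard bands follow the text above. Unlike MyCustomAlloc(), it returns a null pointer on failure instead of throwing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <mutex>

static const size_t GUARD = 64;  // guard band size in bytes

struct AllocHeader {             // list node stored in front of each allocation
    AllocHeader* prev;
    AllocHeader* next;
    size_t       userBytes;
};

// Header size rounded up so the user's pointer keeps malloc's alignment.
static const size_t ALIGN = alignof(std::max_align_t);
static const size_t HDR   = (sizeof(AllocHeader) + ALIGN - 1) / ALIGN * ALIGN;

static AllocHeader* g_head = nullptr;  // head of the global doubly-linked list
static std::mutex   g_listMutex;       // serializes list access across threads

static char* FrontGuard(AllocHeader* h) { return reinterpret_cast<char*>(h) + HDR; }
static char* UserData(AllocHeader* h)   { return FrontGuard(h) + GUARD; }
static char* BackGuard(AllocHeader* h)  { return UserData(h) + h->userBytes; }

void* TrackedAlloc(size_t n)
{
    AllocHeader* h = static_cast<AllocHeader*>(std::malloc(HDR + GUARD + n + GUARD));
    if (!h) return nullptr;
    h->userBytes = n;
    std::memset(FrontGuard(h), 'B', GUARD);
    std::memset(BackGuard(h),  'E', GUARD);

    std::lock_guard<std::mutex> lock(g_listMutex);
    h->prev = nullptr;                 // push onto the front of the list
    h->next = g_head;
    if (g_head) g_head->prev = h;
    g_head = h;
    return UserData(h);
}

void TrackedFree(void* p)
{
    if (!p) return;
    AllocHeader* h = reinterpret_cast<AllocHeader*>(static_cast<char*>(p) - GUARD - HDR);
    {
        std::lock_guard<std::mutex> lock(g_listMutex);
        assert(g_head != nullptr);     // something must be live if we are freeing
        if (h->prev) h->prev->next = h->next; else g_head = h->next;
        if (h->next) h->next->prev = h->prev;
    }
    std::free(h);
}

static bool GuardIntact(const char* g, char fill)
{
    for (size_t i = 0; i < GUARD; i++) if (g[i] != fill) return false;
    return true;
}

// Walks every live allocation and returns how many have a damaged guard band.
size_t CheckForHeapCorruption()
{
    std::lock_guard<std::mutex> lock(g_listMutex);
    size_t bad = 0;
    for (AllocHeader* h = g_head; h; h = h->next) {
        if (!GuardIntact(FrontGuard(h), 'B') || !GuardIntact(BackGuard(h), 'E')) {
            std::fprintf(stderr, "CORRUPTION DETECTED in allocation %p (%zu bytes)\n",
                         static_cast<void*>(UserData(h)), h->userBytes);
            bad++;
        }
    }
    return bad;
}
```

Calling `CheckForHeapCorruption()` before and after a suspect region brackets the faulty write: a return value of 0 followed by a non-zero value means the corruption happened between the two calls. Each check is linear in the number of live allocations, so the calls are best limited to debug builds.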

Repeat until the bug becomes obvious. Good luck!