Why does host_statistics64() return inconsistent results?


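Why does host_statistics64() on OS X 10.6.8 return counts for free, active, inactive, and wired memory that do not add up to the total amount of RAM? And why does the number of missing pages vary from one sample to the next?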

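The following output shows the number of pages not classified as free, active, inactive, or wired over ten seconds (sampled roughly once per second).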

458
243
153
199
357
140
304
93
181
224

The code that produced the numbers above:

#include <stdio.h>
#include <mach/mach.h>
#include <mach/vm_statistics.h>
#include <sys/types.h>
#include <sys/sysctl.h>
#include <unistd.h>
#include <string.h>

int main(int argc, char** argv) {
        struct vm_statistics64 stats;
        mach_port_t host    = mach_host_self();
        natural_t   count   = HOST_VM_INFO64_COUNT;
        natural_t   missing = 0;
        int         debug   = argc == 2 ? !strcmp(argv[1], "-v") : 0;
        kern_return_t ret;
        int           mib[2];
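        uint64_t      ram;      /* hw.memsize (HW_MEMSIZE) is a 64-bit value */
        natural_t     pages;
        size_t        length;
        int           i;

        mib[0] = CTL_HW;
        mib[1] = HW_MEMSIZE;
        length = sizeof(ram);
        if (sysctl(mib, 2, &ram, &length, NULL, 0) != 0) {
                perror("sysctl");
                return 1;
        }
        pages  = (natural_t)(ram / (uint64_t)getpagesize());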

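        for (i = 0; i < 10; i++) {
                /* count is an in/out parameter, so reset it before every call */
                count = HOST_VM_INFO64_COUNT;
                if ((ret = host_statistics64(host, HOST_VM_INFO64, (host_info64_t)&stats, &count)) != KERN_SUCCESS) {
                        fprintf(stderr, "host_statistics64 failed: %d\n", ret);
                        return 1;
                }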

                /* updated for 10.9 */
                missing = pages - (
                        stats.free_count     +
                        stats.active_count   +
                        stats.inactive_count +
                        stats.wire_count     +
                        stats.compressor_page_count
                );

                if (debug) {
                        printf(
                                /* natural_t fields are unsigned int (%u), the rest are uint64_t (%llu) */
                                "%11u pages (# of pages)\n"
                                "%11u free_count (# of pages free) \n"
                                "%11u active_count (# of pages active) \n"
                                "%11u inactive_count (# of pages inactive) \n"
                                "%11u wire_count (# of pages wired down) \n"
                                "%11llu zero_fill_count (# of zero fill pages) \n"
                                "%11llu reactivations (# of pages reactivated) \n"
                                "%11llu pageins (# of pageins) \n"
                                "%11llu pageouts (# of pageouts) \n"
                                "%11llu faults (# of faults) \n"
                                "%11llu cow_faults (# of copy-on-writes) \n"
                                "%11llu lookups (object cache lookups) \n"
                                "%11llu hits (object cache hits) \n"
                                "%11llu purges (# of pages purged) \n"
                                "%11u purgeable_count (# of pages purgeable) \n"
                                "%11u speculative_count (# of pages speculative (also counted in free_count)) \n"
                                "%11llu decompressions (# of pages decompressed) \n"
                                "%11llu compressions (# of pages compressed) \n"
                                "%11llu swapins (# of pages swapped in (via compression segments)) \n"
                                "%11llu swapouts (# of pages swapped out (via compression segments)) \n"
                                "%11u compressor_page_count (# of pages used by the compressed pager to hold all the compressed data) \n"
                                "%11u throttled_count (# of pages throttled) \n"
                                "%11u external_page_count (# of pages that are file-backed (non-swap)) \n"
                                "%11u internal_page_count (# of pages that are anonymous) \n"
                                "%11llu total_uncompressed_pages_in_compressor (# of pages (uncompressed) held within the compressor.) \n",
                                pages, stats.free_count, stats.active_count, stats.inactive_count,
                                stats.wire_count, stats.zero_fill_count, stats.reactivations,
                                stats.pageins, stats.pageouts, stats.faults, stats.cow_faults,
                                stats.lookups, stats.hits, stats.purges, stats.purgeable_count,
                                stats.speculative_count, stats.decompressions, stats.compressions,
                                stats.swapins, stats.swapouts, stats.compressor_page_count,
                                stats.throttled_count, stats.external_page_count,
                                stats.internal_page_count, stats.total_uncompressed_pages_in_compressor
                        );
                }

                printf("%i\n", missing);
                sleep(1);
        }

        return 0;
}

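vm_stat uses the same code and has the same problem of the counts not adding up completely, so this is not a problem with your code. - nneonneo
@nneonneo Oh, I know. I came here to figure out why vm_stat was giving me wrong data. - Chas. Owens
@patrix How many CPUs do you have? My current working theory is that it is related to RAM claimed by the CPUs (information that comes from processor_list, which I have not tracked down yet). If the number of missing pages goes up with the number of CPUs, that would be more evidence. Also, what is your page size (I am assuming 4k)? - Chas. Owens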
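@patrix I don't know whether it is a coincidence, but that looks about right. My worst case is 500 to 600 missing pages, and I have two CPUs. If we scale that to 8 CPUs, you would see 2000 to 2400 missing pages. If that is the case, then these pages are not currently claimed by any CPU, which means they are most likely in transit between CPUs. - Chas. Owens
NB! The line missing -= stats.compressor_page_count; needs to be added to get the actual number of missing pages. - Maxim Kholyavkin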
2 Answers


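TL;DR:

  • host_statistics64() gathers its information from several different sources, which takes time and can produce inconsistent results.
  • host_statistics64() gets some of its information from variables with names like vm_page_foo_count. Not all of these variables are taken into account; vm_page_stolen_count, for example, is not.
  • The well-known /usr/bin/top adds stolen pages to the number of wired pages. This suggests that these pages should be taken into account when counting pages.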

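Notes:

  • I am working on macOS 10.12 with Darwin Kernel Version 16.5.0 xnu-3789.51.2~3/RELEASE_X86_64 x86_64, but all of this behavior is completely reproducible.
  • I am going to link to a lot of source code from the XNU version I use on my machine. It can be found here: xnu-3789.51.2
  • The program you have written is basically the same as /usr/bin/vm_stat, which is just a wrapper for host_statistics64() (and host_statistics()). The corresponding source code can be found here: system_cmds-496/vm_stat.tproj/vm_stat.c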

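How does host_statistics64() fit into XNU, and how does it work?

As is widely known, the OS X kernel is called XNU (XNU IS NOT UNIX) and "is a hybrid kernel combining the Mach kernel developed at Carnegie Mellon University with components from FreeBSD and C++ API for writing drivers called IOKit." https://github.com/opensource-apple/xnu/blob/10.12/README.md

Virtual memory (VM) management is part of Mach, so host_statistics64() lives there. Let's take a closer look at its implementation, which is in xnu-3789.51.2/osfmk/kern/host.c.

The function signature is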

kern_return_t
host_statistics64(host_t host, host_flavor_t flavor, host_info64_t info, mach_msg_type_number_t * count);

The first relevant lines are

[...]
processor_t processor;
vm_statistics64_t stat;
vm_statistics64_data_t host_vm_stat;
mach_msg_type_number_t original_count;
unsigned int local_q_internal_count;
unsigned int local_q_external_count;
[...]
processor = processor_list;
stat = &PROCESSOR_DATA(processor, vm_stat);
host_vm_stat = *stat;

if (processor_count > 1) {
    simple_lock(&processor_list_lock);

    while ((processor = processor->processor_list) != NULL) {
        stat = &PROCESSOR_DATA(processor, vm_stat);

        host_vm_stat.zero_fill_count += stat->zero_fill_count;
        host_vm_stat.reactivations += stat->reactivations;
        host_vm_stat.pageins += stat->pageins;
        host_vm_stat.pageouts += stat->pageouts;
        host_vm_stat.faults += stat->faults;
        host_vm_stat.cow_faults += stat->cow_faults;
        host_vm_stat.lookups += stat->lookups;
        host_vm_stat.hits += stat->hits;
        host_vm_stat.compressions += stat->compressions;
        host_vm_stat.decompressions += stat->decompressions;
        host_vm_stat.swapins += stat->swapins;
        host_vm_stat.swapouts += stat->swapouts;
    }

    simple_unlock(&processor_list_lock);
}
[...]

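We obtain host_vm_stat, which is of type vm_statistics64_data_t. As you can see in xnu-3789.51.2/osfmk/mach/vm_statistics.h, this is just a typedef for struct vm_statistics64. We get the per-processor information from the macro PROCESSOR_DATA(), defined in xnu-3789.51.2/osfmk/kern/processor_data.h. host_vm_stat starts out as a copy of the first processor's counters, and the loop then fills it by walking all remaining processors and simply adding up the relevant numbers.

As you can see, we find some familiar statistics such as zero_fill_count or compressions, but this loop does not cover every field that host_statistics64() returns.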
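To make the accumulation step concrete, here is a minimal user-space sketch of the same pattern: start from the first processor's counters and add those of every other processor. The struct cpu_vm_counters and the function sum_cpu_counters() are hypothetical names made up for this illustration, and only four of the fields are modeled. The real kernel walks processor_list under processor_list_lock and uses PROCESSOR_DATA(), as in the excerpt above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-processor event counters
 * (a small subset of struct vm_statistics64). */
struct cpu_vm_counters {
    uint64_t zero_fill_count;
    uint64_t pageins;
    uint64_t pageouts;
    uint64_t faults;
};

/* Hypothetical helper: copy the counters of CPU 0, then add the
 * counters of every remaining CPU, mirroring the loop in host.c. */
static struct cpu_vm_counters
sum_cpu_counters(const struct cpu_vm_counters *cpus, size_t ncpus)
{
    assert(cpus != NULL && ncpus > 0);

    struct cpu_vm_counters total = cpus[0];

    for (size_t i = 1; i < ncpus; i++) {
        total.zero_fill_count += cpus[i].zero_fill_count;
        total.pageins         += cpus[i].pageins;
        total.pageouts        += cpus[i].pageouts;
        total.faults          += cpus[i].faults;
    }
    return total;
}
```

Only event counters are summed this way. The page counts such as free_count or wire_count are not per-processor values; they are taken from global variables.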

The next relevant lines are:

stat = (vm_statistics64_t)info;

stat->free_count = vm_page_free_count + vm_page_speculative_count;
stat->active_count = vm_page_active_count;
[...]
stat->inactive_count = vm_page_inactive_count;
stat->wire_count = vm_page_wire_count + vm_page_throttled_count + vm_lopage_free_count;
stat->zero_fill_count = host_vm_stat.zero_fill_count;
stat->reactivations = host_vm_stat.reactivations;
stat->pageins = host_vm_stat.pageins;
stat->pageouts = host_vm_stat.pageouts;
stat->faults = host_vm_stat.faults;
stat->cow_faults = host_vm_stat.cow_faults;
stat->lookups = host_vm_stat.lookups;
stat->hits = host_vm_stat.hits;

stat->purgeable_count = vm_page_purgeable_count;
stat->purges = vm_page_purged_count;

stat->speculative_count = vm_page_speculative_count;

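We reuse stat and make it our output structure. We then fill free_count with the sum of two unsigned int counters called vm_page_free_count and vm_page_speculative_count. We collect the remaining data in the same manner (either from variables named vm_page_foo_count or from the statistics accumulated in host_vm_stat above).

1. Conclusion: We collect data from different sources, either from the per-processor information or from variables called vm_page_foo_count. This takes time and can lead to some inconsistency, because the VM subsystem is very fast and runs continuously, so the counters keep changing while they are being read one after another.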
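The effect can be illustrated with a small, deterministic simulation. This is only a sketch of the general problem of reading several counters without a common lock, not kernel code and not proof that this is where the missing pages go. The names page_counters, activate_pages() and torn_snapshot_sum() are hypothetical.

```c
#include <assert.h>

/* Hypothetical global counters, standing in for vm_page_free_count
 * and vm_page_active_count. */
struct page_counters {
    unsigned int free_count;
    unsigned int active_count;
};

/* Simulated VM activity: move n pages from the free list to the active list. */
static void
activate_pages(struct page_counters *c, unsigned int n)
{
    assert(n <= c->free_count);
    c->free_count   -= n;
    c->active_count += n;
}

/* Read the two counters one after the other, the way a snapshot without
 * a common lock would. moved_between_reads simulates pages that change
 * queues between the first and the second read.
 * Returns the total the reader believes it saw. */
static unsigned int
torn_snapshot_sum(struct page_counters *c, unsigned int moved_between_reads)
{
    unsigned int seen_free = c->free_count;       /* first read           */
    activate_pages(c, moved_between_reads);       /* the VM keeps running */
    unsigned int seen_active = c->active_count;   /* second read          */

    return seen_free + seen_active;
}
```

Starting from 100 free and 50 active pages, a snapshot during which 7 pages are activated reports 157 pages, although the real total is still 150. If pages move in the opposite direction relative to the read order, the snapshot comes up short instead.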

Let's take a closer look at the already mentioned vm_page_foo_count variables. They are declared in xnu-3789.51.2/osfmk/vm/vm_page.h as follows:

extern
unsigned int    vm_page_free_count; /* How many pages are free? (sum of all colors) */
extern
unsigned int    vm_page_active_count;   /* How many pages are active? */
extern
unsigned int    vm_page_inactive_count; /* How many pages are inactive? */
#if CONFIG_SECLUDED_MEMORY
extern
unsigned int    vm_page_secluded_count; /* How many pages are secluded? */
extern
unsigned int    vm_page_secluded_count_free;
extern
unsigned int    vm_page_secluded_count_inuse;
#endif /* CONFIG_SECLUDED_MEMORY */
extern
unsigned int    vm_page_cleaned_count; /* How many pages are in the clean queue? */
extern
unsigned int    vm_page_throttled_count;/* How many inactives are throttled */
extern
unsigned int    vm_page_speculative_count;  /* How many speculative pages are unclaimed? */
extern unsigned int vm_page_pageable_internal_count;
extern unsigned int vm_page_pageable_external_count;
extern
unsigned int    vm_page_xpmapped_external_count;    /* How many pages are mapped executable? */
extern
unsigned int    vm_page_external_count; /* How many pages are file-backed? */
extern
unsigned int    vm_page_internal_count; /* How many pages are anonymous? */
extern
unsigned int    vm_page_wire_count;     /* How many pages are wired? */
extern
unsigned int    vm_page_wire_count_initial; /* How many pages wired at startup */
extern
unsigned int    vm_page_free_target;    /* How many do we want free? */
extern
unsigned int    vm_page_free_min;   /* When to wakeup pageout */
extern
unsigned int    vm_page_throttle_limit; /* When to throttle new page creation */
extern
uint32_t    vm_page_creation_throttle;  /* When to throttle new page creation */
extern
unsigned int    vm_page_inactive_target;/* How many do we want inactive? */
#if CONFIG_SECLUDED_MEMORY
extern
unsigned int    vm_page_secluded_target;/* How many do we want secluded? */
#endif /* CONFIG_SECLUDED_MEMORY */
extern
unsigned int    vm_page_anonymous_min;  /* When it's ok to pre-clean */
extern
unsigned int    vm_page_inactive_min;   /* When to wakeup pageout */
extern
unsigned int    vm_page_free_reserved;  /* How many pages reserved to do pageout */
extern
unsigned int    vm_page_throttle_count; /* Count of page allocations throttled */
extern
unsigned int    vm_page_gobble_count;
extern
unsigned int    vm_page_stolen_count;   /* Count of stolen pages not acccounted in zones */
[...]
extern
unsigned int    vm_page_purgeable_count;/* How many pages are purgeable now ? */
extern
unsigned int    vm_page_purgeable_wired_count;/* How many purgeable pages are wired now ? */
extern
uint64_t    vm_page_purged_count;   /* How many pages got purged so far ? */

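That is a lot of statistics, even though host_statistics64() gives us access to only a very limited number of them. Most of these statistics are updated in xnu-3789.51.2/osfmk/vm/vm_resident.c. For example, this function releases a page to the list of free pages: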

/*
*   vm_page_release:
*
*   Return a page to the free list.
*/

void
vm_page_release(
    vm_page_t   mem,
    boolean_t   page_queues_locked)
{
    [...]
    vm_page_free_count++;
    [...]
}

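Very interesting is extern unsigned int vm_page_stolen_count; /* Count of stolen pages not acccounted in zones */. What are stolen pages? It seems there are mechanisms that take a page out of some lists even though it would not normally be paged out. One of these mechanisms is the age of a page in the speculative page list. xnu-3789.51.2/osfmk/vm/vm_page.h tells us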

* VM_PAGE_MAX_SPECULATIVE_AGE_Q * VM_PAGE_SPECULATIVE_Q_AGE_MS
* defines the amount of time a speculative page is normally
* allowed to live in the 'protected' state (i.e. not available
* to be stolen if vm_pageout_scan is running and looking for
* pages)...  however, if the total number of speculative pages
* in the protected state exceeds our limit (defined in vm_pageout.c)
* and there are none available in VM_PAGE_SPECULATIVE_AGED_Q, then
* vm_pageout_scan is allowed to steal pages from the protected
* bucket even if they are underage.
*
* vm_pageout_scan is also allowed to pull pages from a protected
* bin if the bin has reached the "age of consent" we've set

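It is indeed void vm_pageout_scan(void) that increments vm_page_stolen_count. You will find the corresponding source code in xnu-3789.51.2/osfmk/vm/vm_pageout.c.

I think stolen pages are not taken into account when host_statistics64() calculates the VM stats.

Evidence that I am right

The best way to prove this would be to compile XNU by hand with a customized version of host_statistics64(). I have not had the opportunity to do this yet, but will try soon.

Luckily, we are not the only ones interested in correct VM statistics. So we should have a look at the implementation of the well-known /usr/bin/top (not part of XNU), which is completely available here: top-108 (I just picked the macOS 10.12.4 release).

Let's have a look at top-108/libtop.c, where we find the following: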

static int
libtop_tsamp_update_vm_stats(libtop_tsamp_t* tsamp) {
    kern_return_t kr;
    tsamp->p_vm_stat = tsamp->vm_stat;

    mach_msg_type_number_t count = sizeof(tsamp->vm_stat) / sizeof(natural_t);
    kr = host_statistics64(libtop_port, HOST_VM_INFO64, (host_info64_t)&tsamp->vm_stat, &count);
    if (kr != KERN_SUCCESS) {
        return kr;
    }

    if (tsamp->pages_stolen > 0) {
        tsamp->vm_stat.wire_count += tsamp->pages_stolen;
    }

    [...]

    return kr;
}

tsamp is of type libtop_tsamp_t, which is a struct defined in top-108/libtop.h. It contains, among other things, vm_statistics64_data_t vm_stat and uint64_t pages_stolen.

As you can see, static int libtop_tsamp_update_vm_stats(libtop_tsamp_t* tsamp) fills tsamp->vm_stat via host_statistics64(), as we know it. Afterwards it checks whether tsamp->pages_stolen > 0 and, if so, adds it to the wire_count field of tsamp->vm_stat.

2. Conclusion: We will not get the number of these stolen pages if we use only host_statistics64(), as /usr/bin/vm_stat and your example code do!
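As a rough sketch of what such an adjustment could look like in a program like the one from the question, the following helper adds a stolen-page count to the wired pages before summing, in the spirit of the libtop excerpt above. The struct vm_sample and both functions are hypothetical, and pages_stolen is simply a parameter here: this sketch makes no assumption about how top actually obtains that number. The set of fields that are summed is taken from the question's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the values the question's program works with. */
struct vm_sample {
    uint64_t total_pages;            /* hw.memsize / page size */
    uint64_t free_count;
    uint64_t active_count;
    uint64_t inactive_count;
    uint64_t wire_count;
    uint64_t compressor_page_count;  /* 0 on systems before 10.9 */
};

/* Sum of all classified pages, with stolen pages added to the wired pages. */
static uint64_t
accounted_pages(const struct vm_sample *s, uint64_t pages_stolen)
{
    assert(s != NULL);

    uint64_t wired = s->wire_count + pages_stolen;

    return s->free_count + s->active_count + s->inactive_count +
           wired + s->compressor_page_count;
}

/* Signed on purpose: the result goes negative if the counters
 * over-count, instead of wrapping around like an unsigned value. */
static int64_t
missing_pages(const struct vm_sample *s, uint64_t pages_stolen)
{
    return (int64_t)s->total_pages - (int64_t)accounted_pages(s, pages_stolen);
}
```

For a sample with 1000 total pages of which 900 are classified, missing_pages() yields 100 without stolen pages and 40 if 60 stolen pages are passed in.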

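Why is host_statistics64() implemented the way it is?

Honestly, I don't know. Paging is a complex process, so observing it in real time is a challenging task. We should note that there does not seem to be a bug in its implementation. I think we would not get a 100% accurate page count even if we had access to vm_page_stolen_count. The implementation of /usr/bin/top does not count stolen pages if their number is not very large.

Another interesting detail is the comment above the function static void update_pages_stolen(libtop_tsamp_t *tsamp), which reads /* This is for <rdar://problem/6410098>. */. Open Radar is a community bug-reporting site for Apple software, and its reports are typically identified in the format given in the comment. I was unable to find the related bug; maybe it was about missing pages.

I hope this information helps you a bit. If I manage to compile the latest (and a customized) version of XNU on my machine, I will let you know. Maybe that will bring interesting insights.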



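I just noticed that if you add compressor_page_count to the calculation, you get much closer to the actual amount of memory in the machine.

This is only an observation, not an explanation, and links to where this is properly documented would be nice!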


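Compressed pages were introduced in 10.9; the question, as stated, is about 10.6. I have updated the code for 10.9, but it is still missing roughly the same number of pages as on 10.6 (even with the added field). - Chas. Owens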
