Count the most frequent values

Question


matlab, count, matrix

4

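If I have a matrix A of n values ranging from 65 to 90, how do I get the 10 most common values in it? I would like the result to be a 10x2 matrix B, with the 10 most common values in the first column and the number of times each one occurs in the second.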

- gustav

5 Answers

1

This is easy to solve with arrayfun().

A = [...]; % Your target matrix with values 65:90
labels = 65:90; % Possible values to look for
nTimesOccured = arrayfun(@(x) sum(A(:) == x), labels); % count of each label in A
[sorted, sortidx] = sort(nTimesOccured, 'descend');

B = [labels(sortidx(1:10))' sorted(1:10)'];

- Hannes Ovrén

1

This can also be solved with accumarray:

ncounts = accumarray(A(:),1);  %ncounts should now be a 90 x 1 vector of counts
[vals,sidx] = sort(ncounts,'descend');   %vals has the counts, sidx has the number
B = [sidx(1:10),vals(1:10)];

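accumarray is not as fast as one would like, but it is usually faster than other operations of its kind. It took me several reads of its help page to understand what it does. For your purposes it is probably slower than the histc solution, but a bit more straightforward.

--EDIT: I forgot the '1' in the accumarray call.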

- shabbychef

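That's not the right way to use accumarray! Have a look at this video by Doug Hull, which shows typical usage of the function: http://blogs.mathworks.com/videos/2009/10/02/basics-using-accumarray/ - Amro

Yes, I forgot the 1. However, this is the essence of accumarray. I think of it as a fast, well-defined way to perform output(idx) += vals. Your comment notwithstanding, this is the right way to use accumarray. - shabbychef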
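To make the output(idx) += vals reading concrete, here is a minimal sketch that compares accumarray with an explicit loop. The sample matrix and the names idx, loopCounts and accCounts are made up for illustration and are not part of the answer above.

```matlab
% Hypothetical sample data; the values act as positive integer subscripts.
A = [65 82 65 90; 90 70 72 82];
idx = A(:);

% Explicit loop: add 1 to the slot addressed by each value.
loopCounts = zeros(max(idx), 1);
for k = 1:numel(idx)
    loopCounts(idx(k)) = loopCounts(idx(k)) + 1;
end

% accumarray performs the same accumulation in a single call.
accCounts = accumarray(idx, 1);

sameResult = isequal(loopCounts, accCounts);
```

Note that the result has one slot for every subscript from 1 up to max(idx), so slots 1 to 64 are simply zero here.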

1

We can add a fourth option using tabulate from the Statistics Toolbox:

A = randi([65 90], [1000 1]);   %# thousand random integers in the range 65:90
t = sortrows(tabulate(A), -2);  %# compute sorted frequency table
B = t(1:10, 1:2);               %# take the top 10

- Amro

1

Heck, here's another solution, using only simple built-in commands:

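[V, I] = unique(sort(A(:)), 'last');   % I: index of the last occurrence of each value
M = sortrows([V, diff([0; I])], -2);
Top10 = M(1:10, :);

First line: sort all the values, then find the offset at which each distinct value's run ends in the sorted list. The 'last' option is needed on newer MATLAB releases, where unique returns first occurrences by default. Second line: take the differences between consecutive offsets to get the count of each unique value, then sort the rows by those counts.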

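By the way, I would only recommend this method if the range of possible numbers is very large, e.g. [0, 1E8]. In that case some of the other methods may run into out-of-memory errors.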
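As a rough illustration of the memory point, the sketch below counts values spread over a huge range without allocating one counter per possible value. It is a variation rather than the code above: it assumes the third output of unique and the `~` placeholder, which is only available in newer MATLAB releases, and the sample data and the names ic, counts and topN are hypothetical.

```matlab
% Hypothetical sparse data spread over a very large range.
A = [3 1e8 3 42 1e8 3];

[V, ~, ic] = unique(A(:));     % V: sorted distinct values, ic: position of each element in V
counts = accumarray(ic, 1);    % one counter per distinct value, not per possible value

M = sortrows([V, counts], -2); % most frequent first
topN = M(1:min(10, size(M, 1)), :);   % guard against fewer than 10 distinct values
```

Calling accumarray(A(:), 1) directly on this data would instead try to build a vector with 1e8 entries.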

- catchmeifyoutry


- Debilski · Accepted Answer

A = [65 82 65 90; 90 70 72 82]; % Your data
range = 65:90;
res = [range; histc(A(:)', range)]'; % res has values in first column, counts in second.

Now all you need to do is sort the res array by its second column and take the first 10 rows.

sortedres = sortrows(res, -2); % sort by second column, descending
first10 = sortedres(1:10, :)
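
MathWorks now documents histc as not recommended and points to histcounts instead. The following is a sketch of the same approach with histcounts, assuming a release that ships it (R2014b or later). The names vals and edges are introduced here for illustration, and vals replaces range to avoid shadowing a function of that name.

```matlab
A = [65 82 65 90; 90 70 72 82];          % same sample data as above
vals = 65:90;
edges = [vals - 0.5, vals(end) + 0.5];   % one bin centred on each integer value
counts = histcounts(A(:), edges);        % 1x26 row vector of counts

res = sortrows([vals; counts]', -2);     % values in column 1, counts in column 2
first10 = res(1:10, :);
```

With histc the last bin edge only counts exact matches, whereas histcounts treats edges as bin boundaries, so the half-integer edges keep each integer in its own bin.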