Sorting array elements by element frequency

Question

3

In MATLAB/Octave, is it possible to use the sort function to order an array by the relative frequency of its elements?

For example, given the array

m= [4,4,4,10,10,10,4,4,5]

the result should be this array:

[5,10,10,10,4,4,4,4,4]

5 is the least frequent element and comes first, while 4 is the most frequent and comes last. Should I use the indices provided by histcounts?

- linello

3 Answers

3

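The following code first counts how many times each element occurs, then uses runLengthDecode to expand the unique elements back into a full array.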

m = [4,4,4,10,10,10,4,4,5];

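u_m = unique(m);                       % sorted unique values: [4 5 10]

elem_count = histc(m,u_m);             % occurrences of each unique value: [5 1 3]
[elem_count, idx] = sort(elem_count);  % ascending counts [1 3 5], with idx = [2 3 1]

m_sorted = runLengthDecode(elem_count, u_m(idx));  % [5 10 10 10 4 4 4 4 4]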

The definition of runLengthDecode is copied from this answer:

For MATLAB R2015a+:

function V = runLengthDecode(runLengths, values)
if nargin<2
    values = 1:numel(runLengths);
end
V = repelem(values, runLengths);
end

For versions before R2015a:

function V = runLengthDecode(runLengths, values)
%// Actual computation using column vectors
V = cumsum(accumarray(cumsum([1; runLengths(:)]), 1));
V = V(1:end-1);
%// In case of second argument
if nargin>1
    V = reshape(values(V),[],1);
end
%// If original was a row vector, transpose
if size(runLengths,2)>1
    V = V.'; %'
end
end
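
As a rough illustration of how the cumsum/accumarray trick in the pre-R2015a version works, here is a hand-traced sketch for the run lengths [1 3 5] and values [5 10 4] from the example above. The variable names starts, marks and decoded are mine and do not appear in the original function.

```matlab
runLengths = [1 3 5];
values     = [5 10 4];

starts = cumsum([1; runLengths(:)]);  % [1; 2; 5; 10]: start of each run, plus one past the end
marks  = accumarray(starts, 1);       % 10x1 vector with ones at positions 1, 2, 5 and 10
V      = cumsum(marks);               % [1 2 2 2 3 3 3 3 3 4].': run number at each position
V      = V(1:end-1);                  % drop the extra trailing entry
decoded = values(V);                  % values 5 10 10 10 4 4 4 4 4
```

Each 1 in marks bumps the running sum at the start of a new run, so V ends up holding the run number for every output position, and indexing values with it expands the runs.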

- m.s.

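In addition to the link you already provided, you should post the actual code of runLengthDecode in your answer... Also note that repelem is a relatively new function, so it won't work on older versions of MATLAB (although it is easy to make it work). That said, by combining our two answers you can avoid runLengthDecode altogether and just use sort... - Dan

@Dan I copied the code over from the original answer; it does not necessarily depend on repelem. - m.s.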

2

You can use bsxfun to count the repetitions of each element, then use sort to order those counts, and apply that ordering to m:

[~, ind] = sort(sum(bsxfun(@eq,m,m.')));
result = m(ind);
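
For readers on MATLAB R2016b or later, or on Octave, the same comparison can probably be written with implicit expansion instead of bsxfun. A minimal sketch under that assumption, with the intermediate values for the example array written out (the variable name counts is mine):

```matlab
m = [4,4,4,10,10,10,4,4,5];

counts = sum(m == m.');    % frequency of each element: [5 5 5 3 3 3 5 5 1]
[~, ind] = sort(counts);   % stable ascending sort of the frequencies
result = m(ind);           % [5 10 10 10 4 4 4 4 4]
```

Because sort is stable, elements with equal frequency keep their original relative order, so two different values with the same count are not necessarily grouped together. For example, [1 2 1 2] comes back unchanged.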

- Luis Mendo

- Dan · Accepted Answer

One way is to use accumarray to find the count of each number (I suspect you could use histcounts(m,max(m)), but then you would have to clear out all the 0s).

m = [4,4,4,10,10,10,4,4,5];

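[~,~,subs] = unique(m);                  % subs(k): index of m(k) among the sorted unique values
freq = accumarray(subs,subs,[],@numel);  % freq(j): number of occurrences of the j-th unique value
[~,i2] = sort(freq(subs),'descend');     % most frequent first; use 'ascend' for the order in the question

m(i2)                                    % [4 4 4 4 4 10 10 10 5]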

By combining my approach with m.s.'s, you get a simpler solution:

m = [4,4,4,10,10,10,4,4,5];

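[U,~,i1] = unique(m);                 % U: sorted unique values; i1 maps each element of m to its entry in U
freq = histc(m,U);                    % occurrences of each value in U
[~,i2] = sort(freq(i1),'descend');    % most frequent first; use 'ascend' for the order in the question

m(i2)                                 % [4 4 4 4 4 10 10 10 5]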
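
If the ascending order shown in the question is wanted, and values that happen to share a frequency should stay grouped, one possible variant is sketched below. It swaps in accumarray(i1, 1) for the counting step and sortrows for sort; both substitutions are my own and are not part of the answer above, and the names keys and order are illustrative only.

```matlab
m = [4,4,4,10,10,10,4,4,5];

[~, ~, i1] = unique(m);        % i1(k): index of m(k) among the sorted unique values
i1 = i1(:);                    % force a column so accumarray accepts it as subscripts
freq = accumarray(i1, 1);      % freq(j): number of occurrences of the j-th unique value
keys = [freq(i1), i1];         % one row per element: [frequency, rank of the value]
[~, order] = sortrows(keys);   % sort by frequency first, then by value rank
result = m(order);             % [5 10 10 10 4 4 4 4 4]
```

The second key column only matters when two values occur equally often; it breaks the tie by value so that identical elements end up next to each other.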