MATLAB: Incrementing matrix values using indices

4

I have a vector of indices and want to increment the matrix value at each of those indices. For example:

    ind = [1 2 2 5];
    m = zeros(3);
    m(ind) = m(ind) + 1;

This is the result:
    m = [1 0 0
         1 1 0
         0 0 0]

But I need the result to be:
    m = [1 0 0
         2 1 0
         0 0 0]

Time complexity is very important to me, and I can't use a for loop. Thanks.

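Note that vectorized solutions are sometimes no faster than a for loop, because recent versions of MATLAB have heavily optimized for loops. - Ander Biguri
@AnderBiguri Even in this case? - user137927
1
The best way to know is to try. - Ander Biguri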
3 Answers

6
Here is one way. I haven't timed it.
ind = [1 2 2 5];
N = 3;
m = full(reshape(sparse(ind, 1, 1, N^2, 1), N, N));

Equivalently, you can use

ind = [1 2 2 5];
N = 3;
m = reshape(accumarray(ind(:), 1, [N^2 1]), N, N);

or a variant of it (thanks to @beaker):
ind = [1 2 2 5];
N = 3;
m = zeros(N);
m(:) = accumarray(ind(:), 1, [N^2 1]);

This one is probably slower than the others:
ind = [1 2 2 5];
N = 3;
m = zeros(N);
[ii, ~, vv] = find(accumarray(ind(:), 1));
m(ii) = vv;
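
As a quick sanity check, the four variants above can be compared on the question's example. This is only a minimal sketch: the names m1 to m4 and expected are added here for illustration and are not part of the answer.

```matlab
ind = [1 2 2 5];
N = 3;

% Variant 1: sparse sums the values of repeated (row, column) pairs
m1 = full(reshape(sparse(ind, 1, 1, N^2, 1), N, N));

% Variant 2: accumarray sums the values that share a subscript
m2 = reshape(accumarray(ind(:), 1, [N^2 1]), N, N);

% Variant 3: same counts, written into a preallocated matrix
m3 = zeros(N);
m3(:) = accumarray(ind(:), 1, [N^2 1]);

% Variant 4: only the nonzero counts are assigned
m4 = zeros(N);
[ii, ~, vv] = find(accumarray(ind(:), 1));
m4(ii) = vv;

% Result requested in the question
expected = [1 0 0; 2 1 0; 0 0 0];
disp(isequal(m1, m2, m3, m4, expected))
```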

Great vectorized solutions! However, I have a feeling that the first one definitely takes more time than a for loop. - Ander Biguri
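
Whether a loop is actually slower can be checked directly, as the comments suggest. The sketch below is not from the answer: it adds a plain for-loop baseline and times it against accumarray with tic/toc. The problem size and the names num_ind, m_loop, m_acc, t_loop and t_acc are arbitrary choices for illustration, and the timings will depend on the MATLAB version and the machine.

```matlab
N = 1000;                       % matrix is N-by-N (arbitrary size)
num_ind = 1e6;                  % number of indices (arbitrary)
ind = randi(N^2, 1, num_ind);   % random linear indices, repeats allowed

% Loop baseline: increment one element per index
tic
m_loop = zeros(N);
for k = 1:num_ind
    m_loop(ind(k)) = m_loop(ind(k)) + 1;
end
t_loop = toc;

% Vectorized: accumarray counts the occurrences of each index
tic
m_acc = reshape(accumarray(ind(:), 1, [N^2 1]), N, N);
t_acc = toc;

fprintf('loop: %.4f s, accumarray: %.4f s\n', t_loop, t_acc);
disp(isequal(m_loop, m_acc))    % both methods should agree
```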

4

For a sorted array of indices, we can use diff:

out = zeros(M,N);  % Output array of size(M,N)
df = diff([0,ind,ind(end)+1]);
put_idx = diff(find(df)); % gets count of dups
out(ind(df(1:end-1)~=0)) = put_idx;

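The basic idea is to use diff to find the length of each run of duplicate indices. Those run lengths are the values to assign into the array of zeros. The positions that receive them are just the unique indices, which can be found by locating the start of each group of repeated indices.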
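
To make the steps concrete, here is a trace on the question's example, ind = [1 2 2 5] with a 3-by-3 output. The intermediate values in the comments were worked out by hand for this particular input. The names counts and is_start are illustrative only (counts plays the role of put_idx above).

```matlab
ind = [1 2 2 5];          % must be a sorted row vector
M = 3; N = 3;

out = zeros(M, N);
df = diff([0, ind, ind(end)+1]);   % [1 1 0 3 1]: nonzero where a new group starts
counts = diff(find(df));           % find(df) = [1 2 4 5], so counts = [1 2 1]
is_start = df(1:end-1) ~= 0;       % [1 1 0 1]: marks the first element of each group
out(ind(is_start)) = counts;       % unique indices [1 2 5] receive the counts

disp(out)
```

For this input, out equals the matrix requested in the question.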

Benchmarking

Function to create a sorted array of indices (create_data.m):

function ind = create_data(M,N, num_unq_ind, max_repeats)

unq_ind = unique(randi([1,M*N],1,num_unq_ind));
num_repeats = randi(max_repeats, [1,numel(unq_ind)]);
ind = repelem(unq_ind, num_repeats);

Benchmark script (bench1.m) for testing various scenarios:

clear all; close all;

M = 5000; % Array size
N = 5000;

% Input params and setup input indices array (edited for various runs)
num_unq_ind = 100000;
max_repeats = 100;
ind = create_data(M,N, num_unq_ind, max_repeats);

num_iter = 100; % No. of iterations to have reliable benchmarking 
disp('Input params :')
disp(['num_unq_ind = ' int2str(num_unq_ind)])
disp(['max_repeats = ' int2str(max_repeats)])

disp('------------------ Using diff ----------------')
tic
for i=1:num_iter
    out = zeros(M,N);
    df = diff([0,ind,ind(end)+1]);
    put_idx = diff(find(df));
    out(ind(df(1:end-1)~=0)) = put_idx;
end
toc

% Luis's soln
disp('------------------ Using accumaray ----------------')
tic
for i=1:num_iter
    m = reshape(accumarray(ind(:), 1, [M*N 1]), M, N);
end
toc

Runs for various scenarios:

>> bench1
Input params :
num_unq_ind = 10000
max_repeats = 10
------------------ Using diff ----------------
Elapsed time is 0.948544 seconds.
------------------ Using accumaray ----------------
Elapsed time is 1.502658 seconds.
>> bench1
Input params :
num_unq_ind = 100000
max_repeats = 10
------------------ Using diff ----------------
Elapsed time is 1.784576 seconds.
------------------ Using accumaray ----------------
Elapsed time is 1.533280 seconds.
>> bench1
Input params :
num_unq_ind = 10000
max_repeats = 100
------------------ Using diff ----------------
Elapsed time is 1.315998 seconds.
------------------ Using accumaray ----------------
Elapsed time is 1.391323 seconds.
>> bench1
Input params :
num_unq_ind = 100000
max_repeats = 100
------------------ Using diff ----------------
Elapsed time is 6.180565 seconds.
------------------ Using accumaray ----------------
Elapsed time is 3.576154 seconds.

With less sparsity (more unique indices) and more repeats, accumarray appears to perform better.


3
You can use histcounts:
n = 3;
m = reshape(histcounts(ind, [1:n^2 n^2]), n, n);
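
The repeated last edge is what makes this work: histcounts treats each bin as including its left edge and excluding its right edge, except for the last bin, which includes both. A minimal sketch on the question's example follows. The names edges and counts are added for illustration, and ind is taken from the question.

```matlab
ind = [1 2 2 5];
n = 3;

% Edges 1, 2, ..., n^2, n^2 define n^2 bins:
% [1,2), [2,3), ..., [n^2-1, n^2), and a final bin [n^2, n^2]
edges = [1:n^2 n^2];
counts = histcounts(ind, edges);    % one count per linear index
m = reshape(counts, n, n);

disp(m)
```

Because the indices are integers, the edges 1:n^2+1 should give the same counts here.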

Content provided by Stack Overflow.