Converting a NumPy array to a permutation matrix

Question


4

np.array([1,2,3])

I have a numpy array. I want to turn it into a numpy array of tuples holding every 1:1 pairing of its elements. Like this:

np.array([
    [(1,1),(1,2),(1,3)],
    [(2,1),(2,2),(2,3)],
    [(3,1),(3,2),(3,3)],
])

Any ideas on how to do this efficiently? I need to run this operation a few million times.

- user3439329

5 Answers

5

If you're working with numpy, don't use tuples. Take advantage of what numpy offers and add another dimension of size 2. My suggestion is:

x = np.array([1,2,3])
np.vstack(([np.vstack((x, x, x))], [np.vstack((x, x, x)).T])).T

Or:

im = np.vstack((x, x, x))
np.vstack(([im], [im.T])).T

For a general array:

def pair_array(x):  # illustrative wrapper name, added so the return is valid
    ix = np.vstack([x for _ in range(x.shape[0])])
    return np.vstack(([ix], [ix.T])).T

This produces what you want:

array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])

But as a 3D array, as you can see from its shape:

Out[25]: (3L, 3L, 2L)

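As the array grows, this approach becomes faster than the itertools.product-based solution. For an array of size 100, my solution takes 1 ms, while the product-based solution takes 46 ms. @AshwiniChaudhary's solution is more efficient still.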
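As a minimal sketch of why the extra size-2 axis is convenient (this is an illustration, not part of the original answer; the names `pairs`, `first`, `second` and `one_pair` are made up for it), each component of every pair can be sliced out without unpacking any tuples:

```python
import numpy as np

x = np.array([1, 2, 3])
ix = np.vstack([x] * x.size)          # each row is a copy of x
pairs = np.vstack(([ix], [ix.T])).T   # shape (3, 3, 2)

first = pairs[..., 0]    # first element of every pair, shape (3, 3)
second = pairs[..., 1]   # second element of every pair, shape (3, 3)
one_pair = pairs[1, 2]   # the pair stored at row 1, column 2
print(pairs.shape, one_pair)
```

With object-dtype tuples, the same slicing would need a Python-level loop.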

- Korem

2

Another approach, using numpy.meshgrid.

>>> x = np.array([1, 2, 3])
>>> perms = np.stack(np.meshgrid(x, x))
>>> perms
array([[[1, 2, 3],
        [1, 2, 3],
        [1, 2, 3]],

       [[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]]])
>>> perms.transpose().reshape(9, 2)
array([[1, 1],
       [1, 2],
       [1, 3],
       [2, 1],
       [2, 2],
       [2, 3],
       [3, 1],
       [3, 2],
       [3, 3]])

- Bill

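I'm still not sure what the transpose and reshape are doing, but this gives me a list of permutations that I can push back into a prediction model, and then reshape to make nice 3D plots of the dataset. - Elton Clark

It's not really magic, it just reshapes the 3D result into 2D. But that step isn't necessary. The meshgrid result is arguably the closest to what the original question asked for. - Bill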
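For readers who want the (n, n, 2) layout from the question straight out of meshgrid, here is a hedged sketch. It assumes `indexing='ij'` and stacking along the last axis, which is a variation on the answer above rather than the answer's own code:

```python
import numpy as np

x = np.array([1, 2, 3])
# indexing='ij' makes the first grid vary along rows and the second along columns
first, second = np.meshgrid(x, x, indexing='ij')
grid = np.stack((first, second), axis=-1)   # shape (3, 3, 2)
flat = grid.reshape(-1, 2)                  # shape (9, 2), one pair per row
print(grid[0, 2], flat[:3])
```

Here `grid[i, j]` holds the pair `(x[i], x[j])`, so no transpose is needed.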

1

You can use itertools.product to get the pairs, then convert the result to a numpy array.

>>> from itertools import product
>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> p = list(product(a, repeat=2))
>>> np.array([p[i:i+3] for i in range(0,len(p),3)])
array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])

- Mazdak

I think this approach may be the fastest. But why not simply do np.array(p).reshape(3, 3, 2) in the second step? - Bill
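A quick sketch of the reshape suggested in the comment, checked against the chunked list comprehension from the answer (the variable names `chunked` and `reshaped` are illustrative):

```python
from itertools import product

import numpy as np

a = np.array([1, 2, 3])
n = a.size
p = list(product(a, repeat=2))

# Chunking as in the answer, with the hard-coded 3 replaced by n
chunked = np.array([p[i:i + n] for i in range(0, len(p), n)])
# The single reshape suggested in the comment
reshaped = np.array(p).reshape(n, n, 2)

print(np.array_equal(chunked, reshaped))
```

Both produce the same (n, n, 2) array; the reshape just avoids the Python-level slicing loop.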

1

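I was looking into how to do this better in general, not just for 2-tuples. It turns out this can be done quite elegantly with np.indices, which generates a set of indices that can then be used to index the original array: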

>>> x = np.array([1, 2, 3])
>>> i = np.indices((3, 3)).reshape(2, -1)
>>> x[i].T
array([[1, 1],
       [1, 2],
       [1, 3],
       [2, 1],
       [2, 2],
       [2, 3],
       [3, 1],
       [3, 2],
       [3, 3]])

The general case works as follows, where n is the number of items in each permutation.

n = 5
x = np.arange(10)

i = np.indices([x.size for _ in range(n)]).reshape(n, -1)
a = x[i].T

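Then, if needed, you can reshape the result into n-dimensional array form, but the flat list of permutations is usually enough. I haven't tested the performance of this method, but native numpy calls and indexing should be quite fast. To my eye, at least, this is more elegant than the other solutions. It is very similar to the meshgrid solution provided by @Bill.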
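The optional reshape mentioned above could look like the following sketch. It uses a small input so the shapes stay readable; the values in `x` and the name `grid` are chosen only for illustration:

```python
import numpy as np

n = 3                                   # items per tuple
x = np.array([10, 20, 30, 40])

i = np.indices((x.size,) * n).reshape(n, -1)
perms = x[i].T                          # shape (64, 3): one tuple per row

# Fold the flat list back into an n-dimensional grid of tuples
grid = perms.reshape((x.size,) * n + (n,))   # shape (4, 4, 4, 3)
print(perms.shape, grid.shape, grid[1, 2, 3])
```

In this layout `grid[i, j, k]` is the tuple `(x[i], x[j], x[k])`. Note that the result has `x.size ** n` rows, so memory grows quickly with n.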

- Felix

Content provided by Stack Overflow.

- Ashwini Chaudhary · Accepted Answer

You can do something like this:

>>> a = np.array([1, 2, 3])
>>> n = a.size
>>> np.vstack((np.repeat(a, n), np.tile(a, n))).T.reshape(n, n, 2)
array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])

Or, as Jaime suggested, we can get roughly a 10x speedup by taking advantage of broadcasting here:

>>> a = np.array([1, 2, 3])
>>> n = a.size
>>> perm = np.empty((n, n, 2), dtype=a.dtype)
>>> perm[..., 0] = a[:, None]
>>> perm[..., 1] = a
>>> perm
array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])
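To make the two broadcast assignments above more concrete, the sketch below uses np.broadcast_to to show what each right-hand side expands to before it is written into `perm`. This is only an illustration of the broadcasting rule, not how the answer is implemented; `col`, `row` and `rebuilt` are made-up names:

```python
import numpy as np

a = np.array([1, 2, 3])
n = a.size

# a[:, None] has shape (n, 1): it is stretched across the columns
col = np.broadcast_to(a[:, None], (n, n))
# a has shape (n,): it is stretched down the rows
row = np.broadcast_to(a, (n, n))

rebuilt = np.stack((col, row), axis=-1)   # same values as perm above
print(col)
print(row)
```

The assignment form in the answer avoids materialising `col` and `row` separately, which is presumably part of why it is fast.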

Timing comparison:

>>> a = np.array([1, 2, 3]*100)
>>> n = a.size
>>> %%timeit
... np.vstack((np.repeat(a, n), np.tile(a, n))).T.reshape(n, n, 2)
...
1000 loops, best of 3: 934 µs per loop
>>> %%timeit
... perm = np.empty((n, n, 2), dtype=a.dtype)
... perm[..., 0] = a[:, None]
... perm[..., 1] = a
...
10000 loops, best of 3: 111 µs per loop
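Outside IPython, a comparable measurement can be sketched with the standard-library timeit module. The function names below are illustrative, and the numbers will differ by machine and NumPy version, so no expected timings are shown:

```python
import timeit

import numpy as np

a = np.array([1, 2, 3] * 100)
n = a.size

def with_repeat_tile():
    return np.vstack((np.repeat(a, n), np.tile(a, n))).T.reshape(n, n, 2)

def with_broadcasting():
    perm = np.empty((n, n, 2), dtype=a.dtype)
    perm[..., 0] = a[:, None]
    perm[..., 1] = a
    return perm

# Sanity check: both approaches must build the same array
assert np.array_equal(with_repeat_tile(), with_broadcasting())

for func in (with_repeat_tile, with_broadcasting):
    # Best of 3 runs of 100 calls each, reported per call
    best = min(timeit.repeat(func, number=100, repeat=3)) / 100
    print(f"{func.__name__}: {best * 1e6:.0f} µs per call")
```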