Converting a 2D array into a 3D array in Python

3

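I apologize if this has already been asked, but in my case I have a matrix of size 3000000x50 and I want to split it into 300 matrices of size 10000x50. I tried the following, but it does not work.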

>>> import numpy as np
>>> data = np.random.randn(3000000, 50)
>>> D = np.matrix.conjugate(data)
>>> ts = 50
>>> ts = int(ts)         # number of time series that we have from our data
>>> lw = 1e4
>>> lw = int(lw)         # length of each window
>>> l = len(data) / lw   # l is the number of windows
>>> l = np.floor(l)
>>> l = int(l)
>>> # Dc is used to separate each time series into l windows
>>> Dc = np.zeros((l, lw, ts))
>>> for i in range(l):
...     Dc[i][0:lw-1][0:ts-1] = D[(lw)*(i):(lw*(i+1))-1][0:ts-1]

Why not just use np.split() directly: new_array=np.split(D,300) - Mazdak
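A likely reason the loop in the question misbehaves is the chained slicing: in `Dc[i][0:lw-1][0:ts-1]` the second slice is applied to the rows again, not to the columns, and the `-1` offsets drop the last row because slice ends are already exclusive. The sketch below illustrates this on a small 6x4 stand-in array; the names `windows` and `n_windows` and the array size are illustrative assumptions, not part of the original post.

```python
import numpy as np

# Small stand-in for the 3000000x50 array: 6 rows, 4 columns.
data = np.arange(24).reshape(6, 4)

# Chained slices: both slices act on the first axis (rows).
chained = data[0:3][0:2]
# Tuple indexing: the first slice selects rows, the second selects columns.
tupled = data[0:3, 0:2]

print(chained.shape)  # (2, 4)
print(tupled.shape)   # (3, 2)

# A corrected version of the loop, using tuple indexing and exclusive slice ends.
lw = 2                        # window length (rows per window)
n_windows = len(data) // lw   # integer division gives the number of full windows
windows = np.zeros((n_windows, lw, data.shape[1]))
for i in range(n_windows):
    windows[i, :, :] = data[lw * i : lw * (i + 1), :]

print(windows.shape)  # (3, 2, 4)
```

The loop is shown only to make the indexing explicit; it produces the same values as a single reshape call.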
2 Answers

5

You are looking for np.vsplit, which splits an array vertically (row-wise) into multiple sub-arrays -

np.vsplit(data,300)

Sample run -

In [56]: data
Out[56]: 
array([[ 0.46677419,  0.07402051,  0.87270029,  0.12481164],
       [ 0.40789713,  0.36018843,  0.41731607,  0.17348898],
       [ 0.4701256 ,  0.10056201,  0.31289602,  0.18681709],
       [ 0.52407036,  0.89913995,  0.59097535,  0.38376443],
       [ 0.06734662,  0.24470334,  0.09523911,  0.35680219],
       [ 0.91178257,  0.58710922,  0.75099017,  0.24929987]])

In [57]: np.vsplit(data,3)
Out[57]: 
[array([[ 0.46677419,  0.07402051,  0.87270029,  0.12481164],
        [ 0.40789713,  0.36018843,  0.41731607,  0.17348898]]),
 array([[ 0.4701256 ,  0.10056201,  0.31289602,  0.18681709],
        [ 0.52407036,  0.89913995,  0.59097535,  0.38376443]]),
 array([[ 0.06734662,  0.24470334,  0.09523911,  0.35680219],
        [ 0.91178257,  0.58710922,  0.75099017,  0.24929987]])]

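Depending on how you plan to use the output, you can instead reshape the 2D input array into a 3D array whose first axis has length 300. This is far more efficient in both runtime and memory. Memory-wise it is essentially free, because reshaping a contiguous NumPy array only creates a view of it rather than a copy. The implementation is: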

data.reshape(300,-1,data.shape[1])

Sample run -

In [68]: data
Out[68]: 
array([[ 0.46677419,  0.07402051,  0.87270029,  0.12481164],
       [ 0.40789713,  0.36018843,  0.41731607,  0.17348898],
       [ 0.4701256 ,  0.10056201,  0.31289602,  0.18681709],
       [ 0.52407036,  0.89913995,  0.59097535,  0.38376443],
       [ 0.06734662,  0.24470334,  0.09523911,  0.35680219],
       [ 0.91178257,  0.58710922,  0.75099017,  0.24929987]])

In [69]: data.reshape(3,-1,data.shape[1])
Out[69]: 
array([[[ 0.46677419,  0.07402051,  0.87270029,  0.12481164],
        [ 0.40789713,  0.36018843,  0.41731607,  0.17348898]],

       [[ 0.4701256 ,  0.10056201,  0.31289602,  0.18681709],
        [ 0.52407036,  0.89913995,  0.59097535,  0.38376443]],

       [[ 0.06734662,  0.24470334,  0.09523911,  0.35680219],
        [ 0.91178257,  0.58710922,  0.75099017,  0.24929987]]])
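To make the "view, not copy" point concrete, here is a minimal sketch that checks it with `np.shares_memory`. It assumes a freshly created, C-contiguous array (as `np.random.rand` returns); the 6x4 size and the name `windows` are illustrative choices.

```python
import numpy as np

data = np.random.rand(6, 4)
windows = data.reshape(3, -1, data.shape[1])

print(windows.shape)                    # (3, 2, 4)
# The reshaped array reuses the original buffer instead of copying it.
print(np.shares_memory(data, windows))  # True

# Window i holds rows i*2 .. i*2+1 of the original array.
print(np.array_equal(windows[1], data[2:4]))  # True

# Because it is a view, writing through it also changes the original.
windows[0, 0, 0] = -1.0
print(data[0, 0])  # -1.0
```

If the windows must be modified independently of the source array, calling `.copy()` on the reshaped result gives a separate buffer at the cost of the extra memory.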

Here are some runtime tests comparing the actual splitting against reshaping -
In [72]: data = np.random.rand(6000,40)

In [73]: %timeit np.vsplit(data,300)
100 loops, best of 3: 7.05 ms per loop

In [74]: %timeit data.reshape(300,-1,data.shape[1])
1000000 loops, best of 3: 1.08 µs per loop

1
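Readers, please use the reshape approach rather than (v)split - askewchan
@askewchan I completely agree! - Divakar
Thank you for your answer, it helped me a lot. The comparison test is very useful. - amin.akhshi
vsplit requires an even split with no remainder. All sub-arrays must be the same size. - Kermit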
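The last comment can be illustrated with a short sketch. `np.array_split` is not mentioned in the thread; it is brought in here as one possible alternative when the row count is not divisible by the number of sections. The 7x4 array is an illustrative assumption.

```python
import numpy as np

# 7 rows cannot be divided into 3 equal parts.
data = np.arange(28).reshape(7, 4)

try:
    np.vsplit(data, 3)
except ValueError as exc:
    # vsplit refuses an uneven division.
    print("vsplit failed:", exc)

# array_split allows sub-arrays of unequal size.
parts = np.array_split(data, 3, axis=0)
print([p.shape for p in parts])  # [(3, 4), (2, 4), (2, 4)]
```

Note that the result is a list of arrays with different shapes, so it cannot be stacked into a single 3D array without trimming or padding.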

2

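If your initial array is already ordered correctly and you want to split it into 300 matrix "boxes", you only need to reshape the matrix as follows: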

import numpy as np
data = np.random.randn(3000000,50)
newData = data.reshape(300,10000,50) # This is a [300,10000,50] array

print(newData[0,...]) # Show the first matrix, 1 of 300
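The reshape above works because 3000000 is an exact multiple of 10000. For data whose length is not a multiple of the window length, one option is to drop the trailing rows first, which mirrors the floor step in the question. The helper name `to_windows` and the 1005x5 test array below are hypothetical, chosen only for illustration.

```python
import numpy as np

def to_windows(data, window_len):
    """Reshape a 2D array into (n_windows, window_len, n_columns).

    Trailing rows that do not fill a complete window are dropped.
    """
    n_windows = data.shape[0] // window_len
    trimmed = data[: n_windows * window_len]
    return trimmed.reshape(n_windows, window_len, data.shape[1])

data = np.random.randn(1005, 5)
windows = to_windows(data, 100)
print(windows.shape)  # (10, 100, 5)
```

Whether discarding the remainder is acceptable depends on the analysis; padding the last window is another possibility that this sketch does not cover.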
