Slice 2D array into smaller 2D arrays

Question

72

Is there a way to slice a 2D array in numpy into smaller 2D arrays?

Example

[[1,2,3,4],   ->    [[1,2] [3,4]   
 [5,6,7,8]]          [5,6] [7,8]]

So I basically want to cut a 2x4 array into two 2x2 arrays. I am looking for a generic solution that can be used on images.

- TheMeaningfulEngineer

11 Answers

8

I think this is a task for numpy.split or one of its variants.

For example:

import numpy as np

a = np.arange(30).reshape([5,6])  #a.shape = (5,6)
a1 = np.split(a,3,axis=1)
#'a1' is a list of 3 arrays of shape (5,2)
a2 = np.split(a, [2,4])
#'a2' is a list of three arrays of shape (2,6), (2,6), (1,6)

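If you have an NxN image, you can first create a list of 2 sub-images of size NxN/2 and then divide each of them along the other axis. numpy.hsplit and numpy.vsplit are also available.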
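
As a minimal sketch of that two-stage idea, here it is applied to the 2x4 array from the question and to a small square "image". The variable names are illustrative assumptions, not part of the answer.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# One split along the columns gives the two 2x2 blocks from the question.
left, right = np.hsplit(a, 2)

# Two-stage split of a square image: first along the rows,
# then each horizontal strip along the columns.
img = np.arange(16).reshape(4, 4)
strips = np.vsplit(img, 2)                              # two arrays of shape (2, 4)
tiles = [t for s in strips for t in np.hsplit(s, 2)]    # four arrays of shape (2, 2)
```

Note that np.split, np.hsplit and np.vsplit raise an error when the requested number of sections does not divide the axis evenly.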

- Francesco Montesano

7

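Some of the other answers already seem well suited to your specific case, but your question piqued my interest in a memory-efficient solution that works up to the maximum number of dimensions numpy supports, and I ended up spending most of an afternoon on it. (The method itself is relatively simple. I just had not used most of the really fancy features numpy offers, so most of the time went into finding out what numpy provides and how much it could do so that I did not have to do it myself.)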

import numpy as np

def blockgen(array, bpa):
    """Creates a generator that yields multidimensional blocks from the given
array(_like); bpa is an array_like consisting of the number of blocks per axis
(minimum of 1, must be a divisor of the corresponding axis size of array). As
the blocks are selected using normal numpy slicing, they will be views rather
than copies; this is good for very large multidimensional arrays that are being
blocked, and for very large blocks, but it also means that the result must be
copied if it is to be modified (unless modifying the original data as well is
intended)."""
    bpa = np.asarray(bpa) # in case bpa wasn't already an ndarray

    # parameter checking
    if array.ndim != bpa.size:         # bpa doesn't match array dimensionality
        raise ValueError("Size of bpa must be equal to the array dimensionality.")
    if (not np.issubdtype(bpa.dtype, np.integer)  # bpa must be all integers
        or (bpa < 1).any()             # all values in bpa must be >= 1
        or (array.shape % bpa).any()): # % != 0 means not evenly divisible
        raise ValueError("bpa ({0}) must consist of nonzero positive integers "
                         "that evenly divide the corresponding array axis "
                         "size".format(bpa))


    # generate block edge indices
    rgen = (np.r_[:array.shape[i]+1:array.shape[i]//blk_n]
            for i, blk_n in enumerate(bpa))

    # build slice sequences for each axis (unfortunately broadcasting
    # can't be used to make the items easy to operate over)
    c = [[np.s_[i:j] for i, j in zip(r[:-1], r[1:])] for r in rgen]

    # Now to get the blocks; this is slightly less efficient than it could be
    # because numpy doesn't like jagged arrays and I didn't feel like writing
    # a ufunc for it.
    for idxs in np.ndindex(*bpa):
        blockbounds = tuple(c[j][idxs[j]] for j in range(bpa.size))

        yield array[blockbounds]
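
The docstring's point about views can be checked with a short sketch. `iter_blocks` below is a hypothetical, simplified stand-in for blockgen (no parameter checking), written only to keep the example self-contained.

```python
import numpy as np

def iter_blocks(array, bpa):
    """Hypothetical simplified stand-in for blockgen: yields equal-sized block views."""
    steps = [n // b for n, b in zip(array.shape, bpa)]   # block size along each axis
    for idxs in np.ndindex(*bpa):
        yield array[tuple(slice(i * s, (i + 1) * s) for i, s in zip(idxs, steps))]

a = np.arange(8).reshape(2, 4)
blocks = list(iter_blocks(a, (1, 2)))   # 1 block along axis 0, 2 blocks along axis 1

# The blocks are views: writing to one also changes the original array.
blocks[0][0, 0] = 99
shares = np.shares_memory(blocks[0], a)

# Copy a block first if the original data must stay untouched.
safe = blocks[1].copy()
safe[0, 0] = -1
```

After this runs, `a[0, 0]` is 99 while the element that `safe` was copied from is unchanged.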

- JAB

3

Your question is practically the same as this one. You can use a one-liner built on np.ndindex() and reshape():

import numpy as np

def cutter(a, r, c):
    lenr = a.shape[0]//r   # number of blocks along the rows
    lenc = a.shape[1]//c   # number of blocks along the columns
    return np.array([a[i*r:(i+1)*r,j*c:(j+1)*c] for (i,j) in np.ndindex(lenr,lenc)]).reshape(lenr,lenc,r,c)

To create the result you want:

a = np.arange(1,9).reshape(2,4)
#array([[1, 2, 3, 4],
#       [5, 6, 7, 8]])

cutter( a, 1, 2 )
#array([[[[1, 2]],
#        [[3, 4]]],
#       [[[5, 6]],
#        [[7, 8]]]])

- Saullo G. P. Castro

3

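Here are some minor enhancements to TheMeaningfulEngineer's answer that handle the case where the big 2D array cannot be perfectly sliced into equally sized subarrays.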

import numpy as np

def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0]  #image row size
    n = a.shape[1]  #image column size

    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1) #number of block rows
    bpc = ((n-1)//q + 1) #number of block columns
    M = p * bpr
    N = q * bpc

    A = np.nan* np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a

    block_list = []
    previous_row = 0
    for row_block in range(bpr):
        previous_row = row_block * p
        previous_column = 0
        for column_block in range(bpc):
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]

            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]

            ## append
            if block.size:
                block_list.append(block)

    return block_list

Example:

a = np.arange(25)
a = a.reshape((5,5))
out = blockfy(a, 2, 3)

a->
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

out[0] ->
array([[0., 1., 2.],
       [5., 6., 7.]])

out[1]->
array([[3., 4.],
       [8., 9.]])

out[-1]->
array([[23., 24.]])

- Aenaon

2

If you want a solution that also handles the case where the matrix cannot be divided equally, you can use this:

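from functools import reduce
from operator import add
import numpy as np

arr = np.arange(30).reshape(5, 6)  # example input; any 2D array works
half_split = np.array_split(arr, 2)

# split each half along the columns, then concatenate the per-half lists
res = map(lambda x: np.array_split(x, 2, axis=1), half_split)
res = reduce(add, res)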
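
To make the uneven case concrete, here is a small sketch using a 3x5 array, which cannot be halved evenly along either axis. The array and variable names are assumptions chosen for illustration.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# np.array_split tolerates sizes that do not divide evenly:
# the leading chunks receive the extra rows/columns.
row_parts = np.array_split(a, 2, axis=0)
blocks = [b for part in row_parts for b in np.array_split(part, 2, axis=1)]
shapes = [b.shape for b in blocks]
print(shapes)  # [(2, 3), (2, 2), (1, 3), (1, 2)]
```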

- warmspringwinds

2

For now it only works when the big 2D array can be perfectly sliced into equally sized subarrays.

The code below slices

a ->array([[ 0,  1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10, 11],
           [12, 13, 14, 15, 16, 17],
           [18, 19, 20, 21, 22, 23]])

into this

block_array->
    array([[[ 0,  1,  2],
            [ 6,  7,  8]],

           [[ 3,  4,  5],
            [ 9, 10, 11]],

           [[12, 13, 14],
            [18, 19, 20]],

           [[15, 16, 17],
            [21, 22, 23]]])

p and q determine the block size.

Code

import numpy as np

a = np.arange(24)
a = a.reshape((4,6))
m = a.shape[0]  #image row size
n = a.shape[1]  #image column size

p = 2     #block row size
q = 3     #block column size

blocks_per_row = m // p       #number of blocks along the rows
blocks_per_column = n // q    #number of blocks along the columns

block_array = []
previous_row = 0
for row_block in range(blocks_per_row):
    previous_row = row_block * p
    previous_column = 0
    for column_block in range(blocks_per_column):
        previous_column = column_block * q
        block = a[previous_row:previous_row+p,previous_column:previous_column+q]
        block_array.append(block)

block_array = np.array(block_array)

- TheMeaningfulEngineer

2

import numpy as np

a = np.random.randint(1, 9, size=(9,9))
out = [np.hsplit(x, 3) for x in np.vsplit(a,3)]
print(a)
print(out)

yields

[[7 6 2 4 4 2 5 2 3]
 [2 3 7 6 8 8 2 6 2]
 [4 1 3 1 3 8 1 3 7]
 [6 1 1 5 7 2 1 5 8]
 [8 8 7 6 6 1 8 8 4]
 [6 1 8 2 1 4 5 1 8]
 [7 3 4 2 5 6 1 2 7]
 [4 6 7 5 8 2 8 2 8]
 [6 6 5 5 6 1 2 6 4]]
[[array([[7, 6, 2],
       [2, 3, 7],
       [4, 1, 3]]), array([[4, 4, 2],
       [6, 8, 8],
       [1, 3, 8]]), array([[5, 2, 3],
       [2, 6, 2],
       [1, 3, 7]])], [array([[6, 1, 1],
       [8, 8, 7],
       [6, 1, 8]]), array([[5, 7, 2],
       [6, 6, 1],
       [2, 1, 4]]), array([[1, 5, 8],
       [8, 8, 4],
       [5, 1, 8]])], [array([[7, 3, 4],
       [4, 6, 7],
       [6, 6, 5]]), array([[2, 5, 6],
       [5, 8, 2],
       [5, 6, 1]]), array([[1, 2, 7],
       [8, 2, 8],
       [2, 6, 4]])]]

- shahar_m

1

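Here is a solution based on unutbu's answer that handles the case where the matrix cannot be divided equally. In that case it first resizes the matrix using some interpolation. You need OpenCV for this. Note that I had to swap ncols and nrows to make it work, and I did not figure out why.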

import numpy as np
import cv2
import math 

def blockshaped(arr, r_nbrs, c_nbrs, interp=cv2.INTER_LINEAR):
    """
    arr      a 2D array, typically an image
    r_nbrs   number of block rows
    c_nbrs   number of block columns
    """

    arr_h, arr_w = arr.shape

    size_w = int( math.floor(arr_w // c_nbrs) * c_nbrs )
    size_h = int( math.floor(arr_h // r_nbrs) * r_nbrs )

    if size_w != arr_w or size_h != arr_h:
        arr = cv2.resize(arr, (size_w, size_h), interpolation=interp)

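    # NOTE: despite the names, `nrows` is used as the block width and `ncols`
    # as the block height below; the blocks keep the image layout only when
    # r_nbrs == c_nbrs.
    nrows = int(size_w // r_nbrs)
    ncols = int(size_h // c_nbrs)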

    return (arr.reshape(r_nbrs, ncols, -1, nrows) 
               .swapaxes(1,2)
               .reshape(-1, ncols, nrows))

- snoob dogg

0

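Here is my solution. Note that this code does not actually create copies of the original array, so it works well with big data. It also does not crash if the array cannot be divided evenly (but you can easily add a condition for that by removing ceil and checking that the array shape is divisible by p and q without remainder).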

import numpy as np
from math import ceil

a = np.arange(9).reshape(3, 3)

p, q = 2, 2
rows, cols = a.shape

v_slices = ceil(rows / p)   # number of block rows
h_slices = ceil(cols / q)   # number of block columns

for v in range(v_slices):
    for h in range(h_slices):
        block = a[v * p : v * p + p, h * q : h * q + q]
        # do something with a block

This code turns (or, more precisely, gives you direct access to parts of) the following array:

[[0 1 2]
 [3 4 5]
 [6 7 8]]

into this:

[[0 1]
 [3 4]]
[[2]
 [5]]
[[6 7]]
[[8]]

If you need actual copies, Aenaon's code is what you are looking for.

If you are sure that the big array can be divided evenly, you can use the numpy splitting tools.
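
The divisibility check mentioned earlier in this answer might look like the sketch below. `iter_even_blocks` is a hypothetical helper name, and raising ValueError is one possible choice among several.

```python
import numpy as np

def iter_even_blocks(a, p, q):
    """Yield p-by-q views of a 2D array; raise if the shape is not evenly divisible."""
    rows, cols = a.shape
    if rows % p or cols % q:
        raise ValueError(f"shape {a.shape} is not divisible into {p}x{q} blocks")
    for r in range(0, rows, p):
        for c in range(0, cols, q):
            yield a[r:r + p, c:c + q]   # plain slicing, so no data is copied

blocks = list(iter_even_blocks(np.arange(24).reshape(4, 6), 2, 3))
```

Because the function is a generator, the check runs when iteration starts rather than when the function is called.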

- serwus

- unutbu · Accepted Answer

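Another question a couple of months ago clued me in to the idea of using reshape and swapaxes. The h//nrows makes sense because it keeps the rows of the first block together. It also makes sense that nrows and ncols need to be part of the shape. The -1 tells reshape to fill in whatever number is necessary to make the reshape valid. Armed with the form of the solution, I just tried things until I found the formula that works.

You should be able to break your array into "blocks" using some combination of reshape and swapaxes: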

def blockshaped(arr, nrows, ncols):
    """
    Return an array of shape (n, nrows, ncols) where
    n * nrows * ncols = arr.size

    If arr is a 2D array, the returned array should look like n subblocks with
    each subblock preserving the "physical" layout of arr.
    """
    h, w = arr.shape
    assert h % nrows == 0, f"{h} rows is not evenly divisible by {nrows}"
    assert w % ncols == 0, f"{w} cols is not evenly divisible by {ncols}"
    return (arr.reshape(h//nrows, nrows, -1, ncols)
               .swapaxes(1,2)
               .reshape(-1, nrows, ncols))

turns c

np.random.seed(365)
c = np.arange(24).reshape((4, 6))
print(c)

[out]:
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]

into

print(blockshaped(c, 2, 3))

[out]:
[[[ 0  1  2]
  [ 6  7  8]]

 [[ 3  4  5]
  [ 9 10 11]]

 [[12 13 14]
  [18 19 20]]

 [[15 16 17]
  [21 22 23]]]
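
To see why the formula works, it can help to run the three steps of blockshaped one at a time. The following sketch uses the same c as above; the names step1, step2 and step3 are assumptions introduced for illustration.

```python
import numpy as np

c = np.arange(24).reshape(4, 6)
nrows, ncols = 2, 3
h, w = c.shape

# Axes: (block row, row within block, block column, column within block)
step1 = c.reshape(h // nrows, nrows, -1, ncols)
# Axes: (block row, block column, row within block, column within block)
step2 = step1.swapaxes(1, 2)
# Merge the two block indices into a single block counter
step3 = step2.reshape(-1, nrows, ncols)

print(step2[1, 0])   # the block in block row 1, block column 0
print(step3.shape)   # (4, 2, 3)
```

After the swap, indexing with the first two axes already selects a complete block, and the final reshape only flattens those two axes into one.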

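I have posted an inverse function, unblockshaped, in another answer, and an N-dimensional generalization in a further one. The generalization gives a little more insight into the reasoning behind this algorithm.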
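
The linked answers are not reproduced here. As a sketch only, an inverse can be built by running the three steps of blockshaped backwards; the signature below (taking the target height and width) is an assumption.

```python
import numpy as np

def blockshaped(arr, nrows, ncols):
    h, w = arr.shape
    return (arr.reshape(h // nrows, nrows, -1, ncols)
               .swapaxes(1, 2)
               .reshape(-1, nrows, ncols))

def unblockshaped(blocks, h, w):
    """Sketch of an inverse: reassemble (n, nrows, ncols) blocks into an (h, w) array."""
    n, nrows, ncols = blocks.shape
    return (blocks.reshape(h // nrows, -1, nrows, ncols)   # (block row, block col, row, col)
                  .swapaxes(1, 2)                          # (block row, row, block col, col)
                  .reshape(h, w))

c = np.arange(24).reshape(4, 6)
roundtrip = unblockshaped(blockshaped(c, 2, 3), 4, 6)
```

Here `roundtrip` should equal the original c.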

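Note that there is also an answer called blockwise_view. It arranges the blocks in a different format (using more axes), but it has the advantages of (1) always returning a view and (2) being able to handle arrays of any dimension.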
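
This is not the blockwise_view implementation, but the "more axes" layout can be illustrated in 2D by simply omitting the last reshape of blockshaped. `block_view_2d` is a hypothetical helper. For a C-contiguous input the reshape returns a view, whereas for non-contiguous input NumPy may have to copy, so the "always a view" guarantee does not carry over to this sketch.

```python
import numpy as np

def block_view_2d(arr, nrows, ncols):
    """Hypothetical 2D-only helper: 4D result indexed as [block_row, block_col, row, col]."""
    h, w = arr.shape
    assert h % nrows == 0 and w % ncols == 0, "shape must be evenly divisible"
    return arr.reshape(h // nrows, nrows, w // ncols, ncols).swapaxes(1, 2)

c = np.arange(24).reshape(4, 6)
v = block_view_2d(c, 2, 3)

# v shares memory with c here, so a block can be modified in place.
v[0, 1][...] = 0            # zero the top-right 2x3 block of c
```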