Here's one approach using zeros-initialization -
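import numpy as np

def padcols(arr, padlen):
    # Each input column becomes N output columns: padlen zeros, the value, padlen zeros
    N = 1 + 2*padlen
    m, n = arr.shape
    out = np.zeros((m, N*n), dtype=arr.dtype)
    # Write the original columns at positions padlen, padlen+N, padlen+2*N, ...
    out[:, padlen + np.arange(n)*N] = arr
    return out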
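To see where the values land, the column-index expression `padlen + np.arange(n)*N` can be evaluated on its own. The sketch below also tries a strided-slice assignment, `out[:, padlen::N] = arr`, which targets the same columns. That variant is not part of the approach above, and `padcols_slice` is a hypothetical name introduced here only for illustration.

```python
import numpy as np

def padcols_slice(arr, padlen):
    # Hypothetical variant: same output layout, but the target columns
    # are selected with a strided slice instead of an index array.
    N = 1 + 2*padlen
    m, n = arr.shape
    out = np.zeros((m, N*n), dtype=arr.dtype)
    out[:, padlen::N] = arr
    return out

# Target column positions for 3 input columns and padlen = 1
n, padlen = 3, 1
N = 1 + 2*padlen
cols = padlen + np.arange(n)*N
print(cols)  # [1 4 7]

arr = np.arange(1, 7).reshape(2, 3)
res = padcols_slice(arr, 1)
print(res)
```

Whether the slice form is any faster than the index-array form was not measured here.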
Sample run -
In [118]: arr
Out[118]:
array([[21, 14, 23],
       [52, 70, 90],
       [40, 57, 11],
       [71, 33, 78]])
In [119]: padcols(arr,1)
Out[119]:
array([[ 0, 21,  0,  0, 14,  0,  0, 23,  0],
       [ 0, 52,  0,  0, 70,  0,  0, 90,  0],
       [ 0, 40,  0,  0, 57,  0,  0, 11,  0],
       [ 0, 71,  0,  0, 33,  0,  0, 78,  0]])
In [120]: padcols(arr,2)
Out[120]:
array([[ 0,  0, 21,  0,  0,  0,  0, 14,  0,  0,  0,  0, 23,  0,  0],
       [ 0,  0, 52,  0,  0,  0,  0, 70,  0,  0,  0,  0, 90,  0,  0],
       [ 0,  0, 40,  0,  0,  0,  0, 57,  0,  0,  0,  0, 11,  0,  0],
       [ 0,  0, 71,  0,  0,  0,  0, 33,  0,  0,  0,  0, 78,  0,  0]])
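Benchmarking
In this section, I benchmark runtime and memory usage for the approach posted in this answer, padcols, against @Kasramvd's solution function, padder, across several padding lengths.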
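Before comparing speed, it may be worth confirming that the two functions return the same result. The check below is a minimal sketch: the two-argument form `padder(arr, n)` is an assumption inferred from how the function is called in the timings, with its body taken from the memory-test script in this answer.

```python
import numpy as np

def padcols(arr, padlen):
    N = 1 + 2*padlen
    m, n = arr.shape
    out = np.zeros((m, N*n), dtype=arr.dtype)
    out[:, padlen + np.arange(n)*N] = arr
    return out

def padder(arr, n):
    # Assumed signature; inserts n zeros on each side of every column
    x, y = arr.shape
    indices = np.repeat(np.arange(y+1), n*2)[n:-n]
    return np.insert(arr, indices, 0, axis=1)

arr = np.random.randint(10, 99, (4, 3))

# Compare both outputs for several padding lengths
same = all(np.array_equal(padcols(arr, p), padder(arr, p)) for p in (1, 2, 3))
print(same)
```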
Timing profiling
In [151]: arr = np.random.randint(10,99,(300,300))
In [152]: %timeit padder(arr,1)
100 loops, best of 3: 3.56 ms per loop
In [153]: %timeit padcols(arr,1)
100 loops, best of 3: 2.13 ms per loop
In [154]: %timeit padder(arr,2)
100 loops, best of 3: 5.82 ms per loop
In [155]: %timeit padcols(arr,2)
100 loops, best of 3: 3.66 ms per loop
In [156]: %timeit padder(arr,3)
100 loops, best of 3: 7.83 ms per loop
In [157]: %timeit padcols(arr,3)
100 loops, best of 3: 5.11 ms per loop
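`%timeit` is an IPython magic. Outside IPython, the standard-library `timeit` module can produce comparable measurements. The following is a rough sketch rather than the script behind the numbers above: the two-argument `padder(arr, n)` signature is assumed from the calls shown, the repeat counts are arbitrary, and absolute timings will vary with machine and NumPy version.

```python
import timeit
import numpy as np

def padcols(arr, padlen):
    N = 1 + 2*padlen
    m, n = arr.shape
    out = np.zeros((m, N*n), dtype=arr.dtype)
    out[:, padlen + np.arange(n)*N] = arr
    return out

def padder(arr, n):
    # Assumed signature, body as in the memory-test script
    x, y = arr.shape
    indices = np.repeat(np.arange(y+1), n*2)[n:-n]
    return np.insert(arr, indices, 0, axis=1)

arr = np.random.randint(10, 99, (300, 300))

results = {}
for padlen in (1, 2, 3):
    for func in (padder, padcols):
        # Best of 3 repeats, 20 calls per repeat; store per-call time in ms
        timings = timeit.repeat(lambda: func(arr, padlen), number=20, repeat=3)
        results[(func.__name__, padlen)] = min(timings) / 20 * 1e3
        print(f"{func.__name__}(arr, {padlen}): {results[(func.__name__, padlen)]:.2f} ms per call")
```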
Memory profiling
The script used for these memory tests was -
import numpy as np
from memory_profiler import profile

arr = np.random.randint(10,99,(300,300))
padlen = 1  # padding length under test
n = padlen

@profile(precision=10)
def padder():
    x, y = arr.shape
    indices = np.repeat(np.arange(y+1), n*2)[n:-n]
    return np.insert(arr, indices, 0, axis=1)

@profile(precision=10)
def padcols():
    N = 1+2*padlen
    m,n = arr.shape
    out = np.zeros((m,N*n),dtype=arr.dtype)
    out[:,padlen+np.arange(n)*N] = arr
    return out

if __name__ == '__main__':
    padder()
    padcols()
Memory usage output -
Case #1:
$ python -m memory_profiler timing_pads.py
Filename: timing_pads.py
Line #    Mem usage    Increment   Line Contents
================================================
     8 42.4492187500 MiB 0.0000000000 MiB   @profile(precision=10)
     9                                      def padder():
    10 42.4492187500 MiB 0.0000000000 MiB       x, y = arr.shape
    11 42.4492187500 MiB 0.0000000000 MiB       indices = np.repeat(np.arange(y+1), n*2)[n:-n]
    12 44.7304687500 MiB 2.2812500000 MiB       return np.insert(arr, indices, 0, axis=1)
Filename: timing_pads.py
Line #    Mem usage    Increment   Line Contents
================================================
    14 42.8750000000 MiB 0.0000000000 MiB   @profile(precision=10)
    15                                      def padcols():
    16 42.8750000000 MiB 0.0000000000 MiB       N = 1+2*padlen
    17 42.8750000000 MiB 0.0000000000 MiB       m,n = arr.shape
    18 42.8750000000 MiB 0.0000000000 MiB       out = np.zeros((m,N*n),dtype=arr.dtype)
    19 44.6757812500 MiB 1.8007812500 MiB       out[:,padlen+np.arange(n)*N] = arr
    20 44.6757812500 MiB 0.0000000000 MiB       return out
Case #2:
$ python -m memory_profiler timing_pads.py
Filename: timing_pads.py
Line #    Mem usage    Increment   Line Contents
================================================
     8 42.3710937500 MiB 0.0000000000 MiB   @profile(precision=10)
     9                                      def padder():
    10 42.3710937500 MiB 0.0000000000 MiB       x, y = arr.shape
    11 42.3710937500 MiB 0.0000000000 MiB       indices = np.repeat(np.arange(y+1), n*2)[n:-n]
    12 46.2421875000 MiB 3.8710937500 MiB       return np.insert(arr, indices, 0, axis=1)
Filename: timing_pads.py
Line #    Mem usage    Increment   Line Contents
================================================
    14 42.8476562500 MiB 0.0000000000 MiB   @profile(precision=10)
    15                                      def padcols():
    16 42.8476562500 MiB 0.0000000000 MiB       N = 1+2*padlen
    17 42.8476562500 MiB 0.0000000000 MiB       m,n = arr.shape
    18 42.8476562500 MiB 0.0000000000 MiB       out = np.zeros((m,N*n),dtype=arr.dtype)
    19 46.1289062500 MiB 3.2812500000 MiB       out[:,padlen+np.arange(n)*N] = arr
    20 46.1289062500 MiB 0.0000000000 MiB       return out
Case #3:
$ python -m memory_profiler timing_pads.py
Filename: timing_pads.py
Line #    Mem usage    Increment   Line Contents
================================================
     8 42.3906250000 MiB 0.0000000000 MiB   @profile(precision=10)
     9                                      def padder():
    10 42.3906250000 MiB 0.0000000000 MiB       x, y = arr.shape
    11 42.3906250000 MiB 0.0000000000 MiB       indices = np.repeat(np.arange(y+1), n*2)[n:-n]
    12 47.4765625000 MiB 5.0859375000 MiB       return np.insert(arr, indices, 0, axis=1)
Filename: timing_pads.py
Line #    Mem usage    Increment   Line Contents
================================================
    14 42.8945312500 MiB 0.0000000000 MiB   @profile(precision=10)
    15                                      def padcols():
    16 42.8945312500 MiB 0.0000000000 MiB       N = 1+2*padlen
    17 42.8945312500 MiB 0.0000000000 MiB       m,n = arr.shape
    18 42.8945312500 MiB 0.0000000000 MiB       out = np.zeros((m,N*n),dtype=arr.dtype)
    19 47.4648437500 MiB 4.5703125000 MiB       out[:,padlen+np.arange(n)*N] = arr
    20 47.4648437500 MiB 0.0000000000 MiB       return out
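
As a rough sanity check on the increments above, the size of the padded output can be computed directly as m * n * (1 + 2*padlen) elements times the item size. The sketch below assumes 8-byte integers (int64), which is what np.random.randint typically returns on 64-bit Linux and macOS builds but not on every platform. It also assumes the three cases correspond to padlen = 1, 2 and 3, which the output does not state explicitly. `expected_mib` is a hypothetical helper introduced only for this illustration.

```python
import numpy as np

def expected_mib(shape, padlen, dtype=np.int64):
    # Theoretical size of the padded array in MiB (dtype is an assumption)
    m, n = shape
    nbytes = m * n * (1 + 2*padlen) * np.dtype(dtype).itemsize
    return nbytes / 2**20

sizes = {p: expected_mib((300, 300), p) for p in (1, 2, 3)}
for p, s in sizes.items():
    print(f"padlen={p}: {s:.2f} MiB")
# padlen=1: 2.06 MiB
# padlen=2: 3.43 MiB
# padlen=3: 4.81 MiB
```

Under those assumptions the theoretical sizes are of the same order as the increments reported for both functions. The profiler reports process memory rather than array size, so an exact match is not expected.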