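This is a sample cutter that creates sample slices from an array of any dimensionality. It uses functions to control where the cut starts along each axis and how wide the cut is.
Here is an explanation of the parameters: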
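- `arr` - The input NumPy array.
- `loc_sampler_fn` - The function you want to use to place the corner of the box. If you want the corner sampled uniformly from anywhere along an axis, use `np.random.uniform`. If you want the corner to be close to the center of the array, use `np.random.normal`. However, we need to tell the function what range to sample over, which brings us to the next parameter.
- `loc_dim_param` - The name of the `loc_sampler_fn` argument that receives the size of each axis. If we use `np.random.uniform` for the location sampler, we want to sample from the whole range of the axis. `np.random.uniform` takes `low` and `high` arguments, so passing the axis length as `high` samples uniformly over the whole axis. In other words, if the axis has length `120`, we want `np.random.uniform(low=0, high=120)`, so we set `loc_dim_param='high'`.
- `loc_params` - Any additional arguments to pass to `loc_sampler_fn`. Continuing the example, we need to pass `low=0` to `np.random.uniform`, so we pass the dictionary `loc_params={'low':0}`.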
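To make the interplay between `loc_dim_param` and `loc_params` concrete, here is a minimal sketch of the keyword arguments that end up being passed to the sampler for a single axis. The axis length `120` comes from the example above; the names `axis_len` and `kwargs` are only for illustration.

```python
import numpy as np

loc_sampler_fn = np.random.uniform
loc_dim_param = 'high'     # the argument that receives the axis length
loc_params = {'low': 0}    # every other argument for the sampler

axis_len = 120             # illustrative axis length
# Merge the fixed arguments with the per-axis one (loc_params is not mutated).
kwargs = dict(loc_params, **{loc_dim_param: axis_len})

# Equivalent to np.random.uniform(low=0, high=120), truncated to an integer index.
start = int(loc_sampler_fn(**kwargs))
```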
From here, the shape of the box works in basically the same way. If you want the height and width of the box sampled uniformly from 3 to 10, pass `shape_sampler_fn=np.random.uniform`, with `shape_dim_param=None` since we are not using the axis size for anything, and `shape_params={'low':3, 'high':11}`. The draw is truncated with `int`, which is why `high` is 11 rather than 10.

    import numpy as np

    def box_sampler(arr,
                    loc_sampler_fn,
                    loc_dim_param,
                    loc_params,
                    shape_sampler_fn,
                    shape_dim_param,
                    shape_params):
        '''
        Extracts a sample cut from `arr`.
        Parameters:
        -----------
        arr : array
            The numpy array to sample a box from.
        loc_sampler_fn : function
            The function to determine where the minimum coordinate
            for each axis should be placed.
        loc_dim_param : string or None
            The parameter in `loc_sampler_fn` that should use the
            axis dimension size.
        loc_params : dict
            Parameters to pass to `loc_sampler_fn`.
        shape_sampler_fn : function
            The function to determine the width of the sample cut
            along each axis.
        shape_dim_param : string or None
            The parameter in `shape_sampler_fn` that should use the
            axis dimension size.
        shape_params : dict
            Parameters to pass to `shape_sampler_fn`.
        Returns:
        --------
        (slices, x) : A tuple of the slices used to cut the sample as well as
        the sampled subsection with the same dimensionality as arr.
            slices :: list of slice objects
            x :: array object with the same ndims as arr
        '''
        slices = []
        for dim in arr.shape:
            # Inject the current axis length into the sampler arguments.
            if loc_dim_param:
                loc_params.update({loc_dim_param: dim})
            if shape_dim_param:
                shape_params.update({shape_dim_param: dim})
            start = int(loc_sampler_fn(**loc_params))
            stop = start + int(shape_sampler_fn(**shape_params))
            slices.append(slice(start, stop))
        # Index with a tuple: indexing with a list of slices is deprecated in NumPy.
        return slices, arr[tuple(slices)]
An example of a uniform cut on a 2D array, with widths from 3 to 9:

    a = np.random.randint(0, 2, size=(100, 150))
    box_sampler(a,
                np.random.uniform, 'high', {'low':0},
                np.random.uniform, None, {'low':3, 'high':10})

    ([slice(49, 55, None), slice(86, 89, None)],
     array([[0, 0, 1],
            [0, 1, 1],
            [0, 0, 0],
            [0, 0, 1],
            [1, 1, 1],
            [1, 1, 0]]))
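The parameter notes mention `np.random.normal` for corners near the center. `np.random.normal` takes `loc` and `scale` rather than a range, so a single `loc_dim_param` cannot set both from the axis length. One possible workaround, sketched here, is a small wrapper; `centered_start`, `size`, and `rel_scale` are hypothetical names and not part of the code above.

```python
import numpy as np

def centered_start(size, rel_scale=0.1):
    """Draw a start position near the middle of an axis of length `size`.

    Hypothetical helper: the mean is the axis midpoint and the standard
    deviation is `rel_scale * size`. The draw is clipped to stay on the axis.
    """
    draw = np.random.normal(loc=size / 2, scale=rel_scale * size)
    return float(np.clip(draw, 0, size - 1))

# Mimic what box_sampler does per axis when loc_dim_param='size'.
loc_params = {'rel_scale': 0.1}
starts = []
for dim in (100, 150):
    kwargs = dict(loc_params, size=dim)
    starts.append(int(centered_start(**kwargs)))
```

With `box_sampler`, this would correspond to passing `centered_start` as `loc_sampler_fn`, `'size'` as `loc_dim_param`, and `{'rel_scale': 0.1}` as `loc_params`.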
An example of extracting a 2x2x2 block from a 10x20x30 3D array:

    a = np.random.randint(0, 2, size=(10, 20, 30))
    box_sampler(a, np.random.uniform, 'high', {'low':0},
                np.random.uniform, None, {'low':2, 'high':2})

    ([slice(7, 9, None), slice(9, 11, None), slice(19, 21, None)],
     array([[[0, 1],
             [1, 0]],

            [[0, 1],
             [1, 1]]]))
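One thing worth noting about these examples: the start is drawn from the whole axis, so a box that starts near the end of an axis is silently clipped by NumPy slicing and comes back smaller than requested. A small deterministic sketch of that behavior, using hand-picked slices instead of random draws:

```python
import numpy as np

a = np.zeros((10, 20, 30), dtype=int)

# A 2x2x2 request that fits entirely inside the array.
inside = (slice(7, 9), slice(9, 11), slice(19, 21))
# Same widths, but the first axis starts on its last valid index.
at_edge = (slice(9, 11), slice(9, 11), slice(19, 21))

print(a[inside].shape)   # (2, 2, 2)
print(a[at_edge].shape)  # (1, 2, 2): slicing stops at the array boundary
```

If full-size boxes are required, one option is to resample until the returned shape matches the requested widths.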
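Update based on the comments.
For your specific purpose, it looks like you want a rectangular sample whose starting corner is sampled uniformly from anywhere in the array, and whose width along each axis is also sampled uniformly but can be bounded.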
Here is a function that generates these samples. `min_width` and `max_width` accept either an iterable of integers (such as a tuple) or a single integer.

    def uniform_box_sampler(arr, min_width, max_width):
        '''
        Extracts a sample cut from `arr`.
        Parameters:
        -----------
        arr : array
            The numpy array to sample a box from
        min_width : int or tuple
            The minimum width of the box along a given axis.
            If a tuple of integers is supplied, it must have the
            same length as the number of dimensions of `arr`
        max_width : int or tuple
            The maximum width of the box along a given axis.
            If a tuple of integers is supplied, it must have the
            same length as the number of dimensions of `arr`
        Returns:
        --------
        (slices, x) : A tuple of the slices used to cut the sample as well as
        the sampled subsection with the same dimensionality as arr.
            slices :: list of slice objects
            x :: array object with the same ndims as arr
        '''
        # Broadcast scalar widths to one value per axis.
        if isinstance(min_width, (tuple, list)):
            assert len(min_width)==arr.ndim, 'Dimensions of `min_width` and `arr` must match'
        else:
            min_width = (min_width,)*arr.ndim
        if isinstance(max_width, (tuple, list)):
            assert len(max_width)==arr.ndim, 'Dimensions of `max_width` and `arr` must match'
        else:
            max_width = (max_width,)*arr.ndim
        slices = []
        for dim, mn, mx in zip(arr.shape, min_width, max_width):
            start = int(np.random.uniform(0, dim))
            # mx+1 so that, after truncation, mx itself can be drawn.
            stop = start + int(np.random.uniform(mn, mx+1))
            slices.append(slice(start, stop))
        # Index with a tuple: indexing with a list of slices is deprecated in NumPy.
        return slices, arr[tuple(slices)]
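An example that generates a box starting uniformly anywhere in the array, with the height drawn uniformly from 1 to 4 and the width drawn uniformly from 2 to 6 (for illustration only). In this case the box is 3 by 4, starting at the 66th row and the 19th column.

    x = np.random.randint(0, 2, size=(100, 100))
    uniform_box_sampler(x, (1,2), (4,6))

    ([slice(65, 68, None), slice(18, 22, None)],
     array([[1, 0, 0, 0],
            [0, 0, 1, 1],
            [0, 1, 1, 0]]))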
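If the box must always fit inside the array, one possible variant is to draw the width first and then restrict the start accordingly. This is a sketch rather than the function above: `in_bounds_box_sampler` and its `rng` argument are hypothetical, and it assumes each `max_width` is no larger than the corresponding axis length.

```python
import numpy as np

def in_bounds_box_sampler(arr, min_width, max_width, rng=None):
    """Hypothetical variant: sample a box that never crosses the array edge."""
    rng = np.random.default_rng() if rng is None else rng
    # Broadcast scalar widths to one value per axis.
    if not isinstance(min_width, (tuple, list)):
        min_width = (min_width,) * arr.ndim
    if not isinstance(max_width, (tuple, list)):
        max_width = (max_width,) * arr.ndim

    slices = []
    for dim, mn, mx in zip(arr.shape, min_width, max_width):
        # integers() excludes the upper bound, so mx + 1 makes mx reachable.
        width = int(rng.integers(mn, mx + 1))
        # The largest start that still leaves room for `width` is dim - width.
        start = int(rng.integers(0, dim - width + 1))
        slices.append(slice(start, start + width))
    return slices, arr[tuple(slices)]

x = np.zeros((100, 100), dtype=int)
slices, box = in_bounds_box_sampler(x, (1, 2), (4, 6), rng=np.random.default_rng(0))
```

Passing a seeded generator, as in the usage line, makes the draw reproducible.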