NumPy random choice in TensorFlow

Question

28

Is there a function in TensorFlow equivalent to NumPy's random choice? In NumPy, we can pick an item at random from a given list according to its weights.

 np.random.choice([1,2,3,5], 1, p=[0.1, 0, 0.3, 0.6, 0])

This code selects one item from the list according to the weights given in p.
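For reference, np.random.choice requires `a` and `p` to have the same length. A minimal sketch of a valid call, assuming the extra trailing 0 in `p` above was unintended (the variable names are only for illustration):

```python
import numpy as np

items = [1, 2, 3, 5]
weights = [0.1, 0.0, 0.3, 0.6]  # one weight per item; the weights must sum to 1

# Draw one item; 2 can never be drawn because its weight is 0.
sample = np.random.choice(items, 1, p=weights)
print(sample)
```

The result is a NumPy array of length 1, not a scalar, because a sample count of 1 was passed explicitly.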

- seleucia

ValueError: 'a' and 'p' must have same size - mon

10 Answers

10

In TensorFlow 2.0, tf.compat.v1.multinomial is deprecated; use tf.random.categorical instead.
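A minimal sketch of how tf.random.categorical can stand in for a weighted np.random.choice, assuming TensorFlow 2.x in eager mode. The helper name `tf_weighted_choice` and the example weights are illustrative and not part of the answer:

```python
import tensorflow as tf

def tf_weighted_choice(elems, probs, num_samples):
    """Sample num_samples items from elems with replacement, weighted by probs."""
    elems = tf.convert_to_tensor(elems)
    # categorical expects log-probabilities of shape [batch_size, num_classes]
    logits = tf.math.log(tf.constant([probs], dtype=tf.float32))
    # The result has shape [1, num_samples] and holds class indices (int64)
    idx = tf.random.categorical(logits, num_samples)
    return tf.gather(elems, idx[0])

samples = tf_weighted_choice([1, 2, 3, 5], [0.1, 0.2, 0.3, 0.4], 10)
print(samples.numpy())
```

Sampling here is with replacement, so the same item can appear more than once in the output.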

- Arvind

6

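My team and I ran into the same problem, with the added requirements of keeping every operation a TensorFlow op and implementing a "without replacement" version.

Solution: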

import tensorflow as tf

def tf_random_choice_no_replacement_v1(one_dim_input, num_indices_to_drop=3):

    input_length = tf.shape(one_dim_input)[0]

    # draw one uniform random score per element of the sequence
    # for tf.__version__ < 1.11 use tf.random_uniform (underscore instead of dot)
    uniform_distribution = tf.random.uniform(
        shape=[input_length],
        minval=0,
        maxval=None,
        dtype=tf.float32,
        seed=None,
        name=None
    )

    # grab the indices of the (input_length - num_indices_to_drop) greatest values from the distribution
    _, indices_to_keep = tf.nn.top_k(uniform_distribution, input_length - num_indices_to_drop)
    # tf.contrib was removed in TF 2.x; tf.sort replaces tf.contrib.framework.sort
    sorted_indices_to_keep = tf.sort(indices_to_keep)

    # gather the kept elements from the input, preserving their original order
    result = tf.gather(one_dim_input, sorted_indices_to_keep)
    return result

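The idea behind this code is to generate a uniform random vector with the same length as the input and to use the indices of its top k values as the selection. Because the positions of the top k values are as random as the uniform draws themselves, this is equivalent to a uniform random choice without replacement. It can be applied to any 1-D sequence in TensorFlow.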
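The same idea can be sketched in plain Python to make the mechanics explicit. This is only an illustration of the technique: the function name `choice_without_replacement` and its parameters are hypothetical, and it keeps k items rather than dropping a fixed number.

```python
import random

def choice_without_replacement(items, k, rng=random):
    """Uniformly pick k distinct items, preserving their original order."""
    if not 0 <= k <= len(items):
        raise ValueError("k must be between 0 and len(items)")
    # One independent uniform score per position
    scores = [rng.random() for _ in items]
    # The positions of the k largest scores form a uniformly random k-subset
    top_k = sorted(range(len(items)), key=scores.__getitem__, reverse=True)[:k]
    # Sort the positions so the output keeps the input order
    return [items[i] for i in sorted(top_k)]

picked = choice_without_replacement(list("abcdefg"), 4)
print(picked)
```

Every position gets an identically distributed score, so this selects uniformly; it does not honor per-item probabilities.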

- PaulG

The OP's question does not assume a uniform distribution. The probability of each choice is specified. - Brenden Petersen

1

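Thanks for your answer! It was very helpful for a project I'm working on; I was also looking for a TensorFlow alternative to np.random.choice that does not allow repeated selections. - Daniel Mewes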

4

Here is a side-by-side comparison of np.random.choice and tf.random.categorical, with examples.

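import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

p = [i/sum(range(1,6)) for i in range(1,6)]  # 1/15, 2/15, ..., 5/15

N = np.random.choice([0,1,2,3,4], 5000, p=p)
plt.hist(N, density=True, bins=5)
plt.grid()
plt.show()

# tf.random.categorical takes log-probabilities of shape [batch, classes]
T = tf.random.categorical(tf.math.log([p]), 5000)
# T has shape (1, 5000); flatten it so plt.hist sees a single dataset
plt.hist(T.numpy().ravel(), density=True, bins=5)
plt.grid()
plt.show()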

Here is another way to implement it.

    def random_choice(a, size):
        """Random choice from 'a' based on size without duplicates
        Args:
            a: Tensor
            size: int or shape as a tuple of ints e.g., (m, n, k).
        Returns: Tensor of the shape specified with 'size' arg.

        Examples:
            X = tf.constant([[1,2,3],[4,5,6]])
            random_choice(X, (2,1,2)).numpy()
            -----
            [
              [
                [5 4]
              ],
              [
                [1 2]
              ]
            ]
        """
        # 'size' may be a Python int, a NumPy integer, or a scalar integer tensor
        if isinstance(size, (int, np.integer)) or (tf.is_tensor(size) and size.shape == () and size.dtype.is_integer):
            shape = (size,)
        elif isinstance(size, tuple) and len(size) > 0:
            shape = size
        else:
            raise AssertionError(f"Unexpected size arg {size}")

        sample_size = tf.math.reduce_prod(shape, axis=None)
        assert sample_size > 0

        # --------------------------------------------------------------------------------
        # Select elements from a flat array
        # --------------------------------------------------------------------------------
        a = tf.reshape(a, (-1))
        length = tf.size(a)
        assert sample_size <= length

        # --------------------------------------------------------------------------------
        # Shuffle a sequential numbers (0, ..., length-1) and take size.
        # To select 'sample_size' elements from a 1D array of shape (length,),
        # TF Indices needs to have the shape (sample_size,1) where each index
        # has shape (1,),
        # --------------------------------------------------------------------------------
        indices = tf.reshape(
            tensor=tf.random.shuffle(tf.range(0, length, dtype=tf.int32))[:sample_size],
            shape=(-1, 1)   # Convert to the shape:(sample_size,1)
        )
        return tf.reshape(tensor=tf.gather_nd(a, indices), shape=shape)

X = tf.constant([[1,2,3],[4,5,6]])
print(random_choice(X, (2,2,1)).numpy())
---
[[[5]
  [4]]

 [[2]
  [1]]]

- mon

3

I'm late to the party, but I found the simplest solution.

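import numpy as np
import tensorflow as tf

# sample source matrix
M = tf.constant(np.arange(4*5).reshape(4,5))
N_samples = 2
# draw N_samples row indices uniformly (with replacement), then gather those rows
tf.gather(M, tf.cast(tf.random.uniform([N_samples])*M.shape[0], tf.int32), axis=0)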

- t.okuda

2

If you want to randomly sample rows from an n-dimensional tensor, rather than elements from a 1-D tensor, you can combine tf.multinomial and tf.gather.

def _random_choice(inputs, n_samples):
    """
    With replacement.
    Params:
      inputs (Tensor): Shape [n_states, n_features]
      n_samples (int): The number of random samples to take.
    Returns:
      sampled_inputs (Tensor): Shape [n_samples, n_features]
    """
    # (1, n_states) since multinomial requires 2D logits.
    uniform_log_prob = tf.expand_dims(tf.zeros(tf.shape(inputs)[0]), 0)

    # tf.multinomial is the TF 1.x name; in TF 2.x use tf.random.categorical with the same arguments.
    ind = tf.multinomial(uniform_log_prob, n_samples)
    ind = tf.squeeze(ind, 0, name="random_choice_ind")  # (n_samples,)

    return tf.gather(inputs, ind, name="random_choice")

- twink_ml

1

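There is the ready-made tf.multinomial method, but it uses a lot of temporary memory and cannot handle large inputs. I'm late, but I'd like to add another solution. Here is the approach I use (works with TF 2.0):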

# Sampling k members of 1D tensor a using weights w
cum_dist = tf.math.cumsum(w)
cum_dist /= cum_dist[-1]  # to account for floating point errors
unif_samp = tf.random.uniform((k,), 0, 1)
idxs = tf.searchsorted(cum_dist, unif_samp)
samp = tf.gather(a, idxs)  # samp contains the k weighted samples
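
The snippet above is inverse-CDF sampling: each uniform draw is mapped to a position in the cumulative weights. A plain-Python sketch of the same technique, for illustration only (the name `weighted_sample` is hypothetical). It uses a right-sided search so that a draw of exactly 0 cannot land on a leading zero-weight item; with tf.searchsorted the corresponding option is side='right'.

```python
import bisect
import itertools
import random

def weighted_sample(items, weights, k, rng=random):
    """Draw k items with replacement, with probability proportional to weights."""
    cum = list(itertools.accumulate(weights))
    total = cum[-1]
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    out = []
    for _ in range(k):
        u = rng.random() * total  # uniform in [0, total)
        # First index i whose cumulative weight is strictly greater than u
        i = bisect.bisect_right(cum, u)
        # Guard against floating-point rounding at the upper edge
        out.append(items[min(i, len(items) - 1)])
    return out

draws = weighted_sample(["a", "b", "c"], [0.0, 1.0, 3.0], 1000)
print({x: draws.count(x) for x in "abc"})
```

Only the cumulative array of length n and the k draws are held in memory, which is the property the answer relies on for large inputs.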

- Tim Hargreaves

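Is it possible to apply this to a tensor of shape [batch, timesteps, probabilities]? Each timestep is assigned a small set of probabilities, e.g. [0.2, 0.1, 0.7], for a mixture density network. The requirement is to pick one entry per row from another tensor of the same shape, giving a result of shape [batch, timesteps, 1]. - JP K.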

0

You can achieve the same result as follows:

g = tf.random.Generator.from_seed(123)
l = [1, 2, 3]
print(l[tf.squeeze(g.uniform(shape=[1], minval=0, maxval=3, dtype=tf.dtypes.int32))])

- Ajinkya Pol

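Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. - Community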

0

I came across this while trying to solve a similar problem, but I needed something more flexible than the answers given here.

import tensorflow as tf


def select_indices_with_replacement(probabilities, num_indices):
    # Convert the probabilities to a tensor and normalize them, so they sum to 1
    probabilities = tf.constant(probabilities, dtype=tf.float32)
    probabilities /= tf.reduce_sum(probabilities)

    # Flatten the probability tensor so it has a single dimension
    shape = tf.constant([1, tf.reduce_prod(probabilities.get_shape()).numpy()])
    flat_probs = tf.reshape(probabilities, shape)

    # Use the categorical distribution function in TensorFlow to sample indices
    # based on the probabilities
    index = tf.random.categorical(tf.math.log(flat_probs), num_indices, dtype=tf.int32)

    indices = tf.unravel_index(index[0], probabilities.shape)
    return indices


def select_indices_no_replacement(probabilities, num_indices):
    # Convert the probabilities to a tensor and normalize them, so they sum to 1
    probabilities = tf.constant(probabilities, dtype=tf.float32)
    probabilities /= tf.reduce_sum(probabilities)

    # Flatten the probability tensor so it has a single dimension
    shape = tf.constant([1, tf.reduce_prod(probabilities.get_shape()).numpy()])
    flat_probs = tf.reshape(probabilities, shape)

    # Create a tensor of the same shape as the probability tensor, but with all
    # elements set to False
    selected = tf.zeros_like(probabilities, dtype=tf.bool)

    # Use a loop to sample indices without replacement
    indices = []
    for _ in range(num_indices):
        # Use the categorical distribution function to sample an index based on
        # the remaining probabilities
        index = tf.random.categorical(tf.math.log(flat_probs), 1, dtype=tf.int32)
        index = index[0, 0]

        # Convert the flat index to a tuple of indices for the original ND tensor
        indices.append(tf.unravel_index(index, probabilities.shape))
        # indices[-1] = indices[-1].numpy()  # Comment out to leave wrapped in tensorflow

        # Set the probability of the selected index to 0 to ensure it is not
        # selected again
        flat_probs = tf.tensor_scatter_nd_update(flat_probs, [[0, index]], [0.0])

        # Set the selected element to True
        selected = tf.tensor_scatter_nd_update(selected, [indices[-1]], [True])

    indices = tf.transpose(indices)
    return indices


def select_indices(probabilities, num_indices, replace=True):
    if replace:
        return select_indices_with_replacement(probabilities, num_indices)
    else:
        return select_indices_no_replacement(probabilities, num_indices)


def main():
    # Example usage
    probabilities = [[[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]]]
    num_indices = 8
    indices = select_indices(probabilities, num_indices, replace=False)
    print(indices)


if __name__ == "__main__":
    main()

With replace=False:

tf.Tensor(
[[0 0 0 0 0 0 0 0]
 [1 1 0 0 0 1 0 1]
 [2 0 1 2 3 1 0 3]], shape=(3, 8), dtype=int32)

With replace=True:

tf.Tensor(
[[0 0 0 0 0 0 0 0]
 [1 0 0 1 1 0 1 0]
 [3 3 3 2 0 3 3 1]], shape=(3, 8), dtype=int32)

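==Edit==

If you don't need a custom distribution and are happy with a simple uniform one, you can use the following, which does not require building a probability matrix (only its dimensions):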

def select_indices_uniform(dims, num_indices, unique=False):
    # Total number of elements in a tensor with these dimensions
    size = tf.reduce_prod(tf.constant(dims, dtype=tf.int32)).numpy()

    # Use uniform_candidate_sampler to sample flat indices uniformly from [0, size);
    # log_uniform_candidate_sampler would draw from a log-uniform (Zipfian) distribution instead
    samples = tf.random.uniform_candidate_sampler(
            true_classes=[[0]], num_true=1, num_sampled=num_indices, unique=unique, range_max=size)

    # Convert the sampled flat indices (int64) to N-D indices
    indices = tf.unravel_index(samples[0], tf.constant(dims, dtype=tf.int64))
    return indices

- user3303504

-10

There is no need to do the random choice in tf. You can simply select with np.random.choice(data, p=probs), and tf will accept the result.

- Jiabo He

8

But it doesn't run on the GPU. - Raphael Schumann

- sygi · Accepted Answer

No, but you can achieve the same result using tf.multinomial:

elems = tf.convert_to_tensor([1,2,3,5])
samples = tf.multinomial(tf.log([[1, 0, 0.3, 0.6]]), 1) # note log-prob
elems[tf.cast(samples[0][0], tf.int32)].eval()
Out: 1
elems[tf.cast(samples[0][0], tf.int32)].eval()
Out: 5

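The [0][0] part is here because multinomial expects one row of unnormalized log-probabilities per batch element, and the result also has another dimension for the number of samples.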
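
"Unnormalized" means the weights do not have to sum to 1: their logs are treated as logits and normalized by a softmax. A plain-Python sketch of that arithmetic for the weights used above, for illustration only (the helper name `probs_from_weights` is hypothetical, and this is not a description of TensorFlow's internals):

```python
import math

def probs_from_weights(weights):
    """Probabilities implied by sampling with log(weights) used as logits."""
    # log(0) is -inf: a zero weight becomes an impossible class
    logits = [math.log(w) if w > 0 else float("-inf") for w in weights]
    m = max(logits)
    if m == float("-inf"):
        raise ValueError("at least one weight must be positive")
    # Subtract the maximum for numerical stability; exp(-inf) is 0.0
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = probs_from_weights([1, 0, 0.3, 0.6])
print(probs)
```

Each weight is divided by the total of 1.9, so the first element is drawn a little over half the time and the second never.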