在C#中从List<T>中选择N个随机元素的算法

Question

在C#中从List<T>中选择N个随机元素的算法

c#algorithmrandomcollectionselement

4

我需要一个快速的算法来从一个通用列表中选择4个随机元素。例如，我想从一个List中获取4个随机元素，然后根据一些计算，如果找到的元素不合适，则应再从列表中选择下4个随机元素。

- techV

你不想从列表中选择 n 个随机元素，而是想要随机打乱列表中的项目，然后取前 n 个元素。（然后是下一个 n，然后是下一个，依此类推）——或者你真的想冒险再次获得先前获得的相同元素吗？ - Corak

1

能否允许选择同一个项目多次？ - Dmitry Bychenko

@MichalHainc - 实际上，“最佳”方法强烈取决于列表中有多少项以及预计要随机获取多少百分比。如果元素很多，但只想随机获取其中的一些，可以“记住”已经取出的索引（可能在HashSet<int>中），并随机生成n个不在该集合中的索引（生成后立即添加到集合中）。 - Corak

@MichalHainc 洗牌列表是O(n)操作 - 如何比从列表中删除随机项的任何其他方法更昂贵，其时间复杂度也是O(n)？ - Alexei Levenkov

建议将 https://dev59.com/tXVD5IYBdhLWcg3wOo5h 作为原始重复项（标题匹配），但我认为更一般的“洗牌”重复项（特别是可枚举版本 https://dev59.com/EnVC5IYBdhLWcg3wfxU8#7913534）涵盖更多情况，更适合取4而不是再取4。 - Alexei Levenkov

6个回答

1

类似这样的内容：

using System;
using System.Collections.Generic;

        public class Program
        {
            public static void Main()
            {
                var list = new List<int>();

                list.Add(1);
                list.Add(2);
                list.Add(3);
                list.Add(4);
                list.Add(5);

                int n = 4;

                var rand = new Random();

                var randomObjects = new List<int>();

                for (int i = 0; i<n; i++)
                {
                    var index = rand.Next(list.Count);

                    randomObjects.Add(list[index]);
                }       

            }
        }

- MKasprzyk

1

你可以这样做。

    public static class Extensions
    {
        public static Dictionary<int, T> GetRandomElements<T>(this IList<T> list, int quantity)
        {
            var result = new Dictionary<int, T>();
            if (list == null)
                return result;
            Random rnd = new Random(DateTime.Now.Millisecond);
            for (int i = 0; i < quantity; i++)
            {
                int idx = rnd.Next(0, list.Count);
                result.Add(idx, list[idx]);
            }
            return result;
        }
    }

然后像这样使用扩展方法：

    List<string> list = new List<string>() { "a", "b", "c", "d", "e", "f", "g", "h" };
    Dictionary<int, string> randomElements = list.GetRandomElements(3);
    foreach (KeyValuePair<int, string> elem in randomElements)
    {
        Console.WriteLine($"index in original list: {elem.Key} value: {elem.Value}");
    }

- michalh

1

如果你在方法内部声明 Random rnd = new Random(DateTime.Now.Millisecond); 并且在一个非常紧密的循环中频繁调用该方法，那么你有可能得到相同的“随机”元素。最好将其声明为 Extension 的静态字段并放在方法外部。此外，使用这种方法存在在多次调用中获得相同项的风险，甚至可能在同一次调用中出现。--问题是，在 OP 的情况下，这种行为是否可以接受。 - Corak

1

我认为这并没有太大帮助，因为DateTime的分辨率不够高（你会得到超过100个刻度的“跳跃”；这就是为什么建议在需要高精度且可用时使用Stopwatch）。只需拥有一个Random实例并重复使用它（永远不要“new”它），就可以完全避免这个问题。 - Corak

0

假设列表的长度为N。现在假设您将这4个数字放入另一个名为out的列表中。然后，您可以循环遍历该列表，并且您所在的元素被选择的概率为：

(4 - (out.Count)) / (N - currentIndex)

- Leonid

0

funcion (list)
(
loop i=0 i < 4
  index = (int) length(list)*random(0 -> 1)  
  element[i] = list[index] 
return element
) 

while(check  == false)
(
   elements = funcion (list)

    Do some calculation which returns check == false /true
)

这是伪代码，但我认为你应该自己想出来。希望能帮到你 :)

- Gijs Den Hollander

0

到目前为止，所有的答案都有一个根本性的缺陷；你正在寻找一个算法，它将生成一个由n个元素组成的随机组合，并且这个组合，在遵循某些逻辑规则的情况下，可以是有效的或无效的。如果不是有效的，则应该产生一个新的组合。显然，这个新的组合应该是从未产生过的。所有提出的算法都没有强制执行这一点。例如，如果在1000000个可能的组合中只有一个是有效的，那么您可能会浪费大量的资源，直到产生那个特定的唯一组合。

那么，如何解决这个问题呢？嗯，答案很简单，创建所有可能的独特解决方案，然后以随机顺序简单地产生它们。警告：我将假设输入流没有重复元素，如果有重复元素，则有些组合将不是唯一的。

首先，让我们编写一个便于使用的不可变栈：

class ImmutableStack<T> : IEnumerable<T>
{
    public static readonly ImmutableStack<T> Empty = new ImmutableStack<T>();
    private readonly T head;
    private readonly ImmutableStack<T> tail;
    public int Count { get; }

    private ImmutableStack()
    {
        Count = 0;
    }

    private ImmutableStack(T head, ImmutableStack<T> tail)
    {
        this.head = head;
        this.tail = tail;
        Count = tail.Count + 1;
    }

    public T Peek()
    {
        if (this == Empty)
            throw new InvalidOperationException("Can not peek a empty stack.");

        return head;
    }

    public ImmutableStack<T> Pop()
    {
        if (this == Empty)
            throw new InvalidOperationException("Can not pop a empty stack.");

        return tail;
    }

    public ImmutableStack<T> Push(T item) => new ImmutableStack<T>(item, this);

    public IEnumerator<T> GetEnumerator()
    {
        var current = this;

        while (current != Empty)
        {
            yield return current.head;
            current = current.tail;
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

通过递归生成所有组合将使我们的生活更加轻松。接下来，让我们正确获取主方法的签名：

public static IEnumerable<IEnumerable<T>> GetAllPossibleCombinationsInRandomOrder<T>(
    IEnumerable<T> data, int combinationLength)

好的，看起来差不多了。现在让我们实现这个东西：

    var allCombinations = GetAllPossibleCombinations(data, combinationLength).ToArray();
    var rnd = new Random();
    var producedIndexes = new HashSet<int>();

    while (producedIndexes.Count < allCombinations.Length)
    {
        while (true)
        {
            var index = rnd.Next(allCombinations.Length);

            if (!producedIndexes.Contains(index))
            {
                producedIndexes.Add(index);
                yield return allCombinations[index];
                break;
            }
        }
    }

好的，我们在这里所做的就是产生随机的索引，检查我们是否已经生成过了（我们使用HashSet<int>来实现此功能），然后返回该索引处的组合。

很简单，现在我们只需要处理GetAllPossibleCombinations(data, combinationLength)。

那很容易，我们将使用递归。我们的退出条件是当前组合达到指定长度。另一个要注意的地方是：我在整个代码中省略了参数验证，比如检查null或指定的长度是否大于输入长度等等，应该注意这些问题。

为了好玩，我将在这里使用一些小的C#7语法：嵌套函数。

public static IEnumerable<IEnumerable<T>> GetAllPossibleCombinations<T>(
    IEnumerable<T> stream, int length)
{
    return getAllCombinations(stream, ImmutableStack<T>.Empty);

    IEnumerable<IEnumerable<T>> getAllCombinations<T>(IEnumerable<T> currentData, ImmutableStack<T> combination)
    {
        if (combination.Count == length)
            yield return combination;

        foreach (var d in currentData)
        {
            var newCombination = combination.Push(d);

            foreach (var c in getAllCombinations(currentData.Except(new[] { d }), newCombination))
            {
                yield return c;
            }
        }
    }
}

好的，现在我们可以使用它：

var data = "abc";
var random = GetAllPossibleCombinationsInRandomOrder(data, 2);

foreach (var r in random)
{
    Console.WriteLine(string.Join("", r));
}

果然，输出结果为：

bc
cb
ab
ac
ba
ca

- InBetween

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Roman · Accepted Answer

你可以将索引存储在某个列表中，以获取不重复的索引：

List<T> GetRandomElements<T>(List<T> allElements, int randomCount = 4)
{
    if (allElements.Count < randomCount)
    {
        return allElements;
    }

    List<int> indexes = new List<int>();

    // use HashSet if performance is very critical and you need a lot of indexes
    //HashSet<int> indexes = new HashSet<int>(); 

    List<T> elements = new List<T>();

    Random random = new Random(); 
    while (indexes.Count < randomCount)
    {
        int index = random.Next(allElements.Count);
        if (!indexes.Contains(index))
        {
            indexes.Add(index);
            elements.Add(allElements[index]);
        }
    }

    return elements;
}

然后您可以进行一些计算并调用此方法：

void Main(String[] args)
{
    do
    {
        List<int> elements = GetRandomelements(yourElements);
        //do some calculations
    } while (some condition); // while result is not right
}