Split an array into an array of subsequence arrays

11

I have a byte array:

byte[] bytes;  // many elements

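I need to split it into byte-array subsequences of X elements each. For example, X = 4.

If bytes.Length is not divisible by X, pad the last subsequence array with 0s so that every subsequence has length exactly X.

LINQ is available.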
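To make the requirement concrete, here is a minimal sketch of the arithmetic behind it. The class and method names (`PaddingMath`, `ChunkCount`, `PadCount`) are hypothetical and only illustrate the numbers; for 11 bytes and X = 4 there are 3 subsequences and the last one needs a single 0 appended.

```csharp
using System;

public static class PaddingMath
{
    // Number of chunks needed to hold `length` items at `size` per chunk
    // (integer ceiling division).
    public static int ChunkCount(int length, int size) => (length + size - 1) / size;

    // Number of zeros to append so that `length` becomes a multiple of `size`.
    // The outer "% size" makes the result 0 when no padding is needed.
    public static int PadCount(int length, int size) => (size - length % size) % size;

    public static void Main()
    {
        Console.WriteLine(ChunkCount(11, 4)); // 3
        Console.WriteLine(PadCount(11, 4));   // 1
        Console.WriteLine(PadCount(12, 4));   // 0
    }
}
```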

PS: my attempt

static void Main(string[] args)
{
    List<byte> bytes = new List<byte>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };

    int c = bytes.Count / 4;

    for (int i = 0; i <= c; i+=4)
    {
        int diff = bytes.Count - 4;

        if (diff < 0)
        {

        }
        else
        {
            List<byte> b = bytes.GetRange(i, 4);
        }
    }

    Console.ReadKey();
}
13 Answers

31

This is quite cute:

static class ChunkExtension
{
    public static IEnumerable<T[]> Chunkify<T>(
        this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (size < 1) throw new ArgumentOutOfRangeException("size");
        using (var iter = source.GetEnumerator())
        {
            while (iter.MoveNext())
            {
                var chunk = new T[size];
                chunk[0] = iter.Current;
                for (int i = 1; i < size && iter.MoveNext(); i++)
                {
                    chunk[i] = iter.Current;
                }
                yield return chunk;
            }
        }
    }
}
static class Program
{
    static void Main(string[] args)
    {
        List<byte> bytes = new List<byte>() {
              1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };
        var chunks = bytes.Chunkify(4);
        foreach (byte[] chunk in chunks)
        {
            foreach (byte b in chunk) Console.Write(b.ToString("x2") + " ");
            Console.WriteLine();
        }
    }
}

1
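Cute :) - Chris McCall
Nice. But when the number of elements in "source" is not a multiple of "size", watch out for the unfilled array elements in the last chunk (they keep their default value). - mkoertgen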
2
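Easy fix: declare "int i" outside the loop, and if "i" is less than "size", use Array.Resize(...). - mkoertgen
I would convert that ugly for loop into a do .. while: var chunk = new T[size]; var i = 0; do { chunk[i] = iter.Current; i++; } while (i < size && iter.MoveNext()); yield return chunk; - hIpPy
Is there any benefit to iterating the list manually (as in the Chunkify implementation) rather than using foreach? - wischi
@wischi Yes: code convenience. Try rewriting the code above with foreach: it gets messy because of the two related loops (data vs. array). For a really good example of the same pattern, see Zip. - Marc Gravell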
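The fix suggested in the comments above could be sketched as follows. This is only an illustration, not the answer author's code: the name `ChunkifyTrimmed` is hypothetical, and the variant deliberately trims the last chunk with Array.Resize instead of leaving it padded, which is the opposite of what the question asks for but is what the comment describes.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkTrimExtension
{
    // Hypothetical variant of Chunkify: the last chunk is shrunk to the number
    // of elements actually read instead of being padded with default(T).
    public static IEnumerable<T[]> ChunkifyTrimmed<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));
        return Iterate(source, size);
    }

    private static IEnumerable<T[]> Iterate<T>(IEnumerable<T> source, int size)
    {
        using (var iter = source.GetEnumerator())
        {
            while (iter.MoveNext())
            {
                var chunk = new T[size];
                int i = 0; // declared outside the fill loop so it is visible afterwards
                do
                {
                    chunk[i++] = iter.Current;
                } while (i < size && iter.MoveNext());

                if (i < size) Array.Resize(ref chunk, i); // trim the final short chunk
                yield return chunk;
            }
        }
    }
}

public static class TrimDemo
{
    public static void Main()
    {
        var bytes = new List<byte> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };
        foreach (byte[] chunk in bytes.ChunkifyTrimmed(4))
            Console.WriteLine(string.Join(" ", chunk));
        // The last chunk holds three values (9 10 11) rather than four.
    }
}
```

If zero padding is wanted, as in the question, simply omit the Array.Resize call; the unfilled slots then keep default(T), which is 0 for byte.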

7

The voted answer works if you always get source.Length % size != 0, although it is too verbose. Here is a nicer implementation:

public static IEnumerable<T[]> AsChunks<T>(IEnumerable<T> source, int size)
{
    var chunk = new T[size];
    var i = 0;
    foreach(var e in source)
    {
        chunk[i++] = e;
        if (i==size)
        {
            yield return chunk;
            // Start a fresh array; reusing the old one would overwrite
            // a chunk the caller may still be holding on to.
            chunk = new T[size];
            i=0;
        }
    }
    if (i>0) // Anything left?
    {
        Array.Resize(ref chunk, i);
        yield return chunk;
    }
}

void Main()
{
    foreach(var chunk in AsChunks("Hello World!",5))
        Console.WriteLine(new string(chunk));
}

Output:

    Hello
     Worl
    d!

3
This does it nicely:

    public static IEnumerable<IEnumerable<T>> GetBatches<T>(this IEnumerable<T> items, int batchsize) {
        var itemsCopy = items;
        while (itemsCopy.Any()) {
            yield return itemsCopy.Take(batchsize);
            itemsCopy = itemsCopy.Skip(batchsize);
        }
    }

3
How about this:
var bytes = new List<byte>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };

var result = Chunkify(bytes, 4);

IEnumerable<IEnumerable<T>> Chunkify<T>(IEnumerable<T> source, int chunkSize)
{
    var indicies = 
        Enumerable.Range(0, source.Count()).Where(i => i%chunkSize==0);

    var chunks = 
            indicies
            .Select( i => source.Skip(i).Take(chunkSize) )
            .Select( chunk => new { Chunk=chunk, Count=chunk.Count() } )
            .Select( c => c.Count < chunkSize ? c.Chunk.Concat( Enumerable.Repeat( default(T), chunkSize - c.Count ) ) : c.Chunk )
            ;

    return chunks;      
}

2
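Note that this enumerates source multiple times. If it were a LINQ to SQL query, for example, it might execute the SQL query hundreds of times! When writing IEnumerable<T> methods like this, it is best to enumerate the sequence only once. See this implementation for what I mean. The OP is asking about a materialized collection of bytes, so it is not a problem in this case, but others who visit this question may need to be aware of the distinction. - Drew Noakes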
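The difference described in that comment can be observed with a small sketch. Everything here is hypothetical illustration code: `CountingSource` is a stand-in for an expensive source (such as a database query) that counts how often it is started, `MultiPass` imitates the Count/Skip/Take style of the answer above without the padding step, and `SinglePass` is one possible single-enumeration alternative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCountDemo
{
    public static int Enumerations; // how many times the source was started

    // Hypothetical source that records each time it is enumerated from the start.
    public static IEnumerable<int> CountingSource()
    {
        Enumerations++;
        for (int i = 1; i <= 11; i++) yield return i;
    }

    // Multi-pass chunking: Count() plus one Skip/Take pass per chunk.
    public static IEnumerable<IEnumerable<T>> MultiPass<T>(IEnumerable<T> source, int size)
    {
        return Enumerable.Range(0, source.Count())
            .Where(i => i % size == 0)
            .Select(i => source.Skip(i).Take(size));
    }

    // Single-pass chunking: one foreach over the source, buffering one chunk at a time.
    public static IEnumerable<List<T>> SinglePass<T>(IEnumerable<T> source, int size)
    {
        var chunk = new List<T>(size);
        foreach (var item in source)
        {
            chunk.Add(item);
            if (chunk.Count == size)
            {
                yield return chunk;
                chunk = new List<T>(size);
            }
        }
        if (chunk.Count > 0) yield return chunk;
    }

    // Runs a chunker over a fresh counting source and fully consumes the result.
    public static int Measure(Func<IEnumerable<int>, IEnumerable<IEnumerable<int>>> chunker)
    {
        Enumerations = 0;
        foreach (var chunk in chunker(CountingSource()))
            foreach (var item in chunk) { } // force full enumeration
        return Enumerations;
    }

    public static void Main()
    {
        Console.WriteLine(Measure(s => MultiPass(s, 4)));  // more than 1
        Console.WriteLine(Measure(s => SinglePass(s, 4))); // 1
    }
}
```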

1
const int x = 4;
var bytes = new List<byte>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };
// ToList() materializes the groups once, so the padding added below is not lost on re-enumeration
var groups = bytes.Select((b, index) => new { b, index }).GroupBy(obj => obj.index / x).Select(group => new List<byte>(group.Select(i => i.b))).ToList();
var last = groups.Last();
while (last.Count < x)
{
    last.Add(0);
}

1
A good solution, but note that it is forced to buffer the entire sequence, which may be perfectly fine in most common scenarios. - Marc Gravell

1
You can try this:
    List<byte> bytes = new List<byte>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };

    int partLength = 4;
    int c = bytes.Count / partLength;

    // test the element count (not c) for a remainder
    if ((bytes.Count % partLength) != 0)
        c++; // we need one last list which will have to be filled with 0s

    List<List<byte>> allLists = new List<List<byte>>();

    // skip the parts already taken so each list gets the next partLength elements
    for (int i = 0; i < c; i++)
        allLists.Add(bytes.Skip(i * partLength).Take(partLength).ToList());

    int zerosNeeded = partLength - allLists.Last().Count;

    for (int i = 0; i < zerosNeeded; i++)
        allLists.Last().Add(0);

If anything is unclear, please ask.


1

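Of course you would want to go with Marc Gravell's solution, but I could not resist hacking together a pure LINQ version, just to see whether it could be done: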

static IEnumerable<T[]> LinqChunks<T>(IEnumerable<T> input, int chunkSize)
{
  return input
    //assign chunk numbers to elements by integer division
    .Select((x, index) => new {ChunkNr = index / chunkSize, Value = x})

    //group by chunk number
    .GroupBy(item => item.ChunkNr)

    //convert chunks to arrays, and pad with zeroes if necessary
    .Select(group =>
              {
                var block = group.Select(item => item.Value).ToArray();

                //if block size = chunk size -> return the block
                if (block.Length == chunkSize) return block;

                //if block size < chunk size -> this is the last block, pad it
                var lastBlock= new T[chunkSize];
                for (int i = 0; i < block.Length; i++) lastBlock[i] = block[i];
                return lastBlock;
              });
}

1

In case anyone wants a purely functional solution -

static IEnumerable<T[]> Chunkify<T>(IEnumerable<T> input, int size)
{
    return input    
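        // pad only when needed: the outer "% size" yields 0 when the count is already a multiple of size
        .Concat(Enumerable.Repeat(default(T), (size - input.Count() % size) % size))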
        .Select((x, i) => new { Value = x, Chunk = i / size })
        .GroupBy(x => x.Chunk, x => x.Value)
        .Select(x => x.ToArray());
}

1
/// <summary>
/// Splits an array of bytes into a List<byte[]> holding the
/// chunks of the original array. If the size of the chunks is bigger than
/// the array it will return the original array to be split.
/// </summary>
/// <param name="array">The array to split</param>
/// <param name="size">the size of the chunks</param>
/// <returns></returns>
public static List<byte[]> SplitArray(byte[] array, int size)
{
    List<byte[]> chunksList = new List<byte[]>();
    int skipCounter = 0;

    while (skipCounter < array.Length)
    {
        byte[] chunk = array.Skip(skipCounter).Take(size).ToArray<byte>();
        chunksList.Add(chunk);
        skipCounter += chunk.Length;
    }
    return chunksList;
}

0

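This answer is aimed more at the IEnumerable case, but that question was marked as a duplicate of this one.

There are many solutions, but none of them was lazy enough for me. This one does the trick: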

  private class CachedEnumeration<T> : IEnumerable<T>  
  {  
    /// <summary>  
    /// enumerator for the cachedEnumeration class  
    /// </summary>  
    class CachedEnumerator : IEnumerator<T>  
    {  
      private readonly CachedEnumeration<T> m_source;  
      private int m_index;  
      public CachedEnumerator(CachedEnumeration<T> source)  
      {  
        m_source = source;  
        // start at index -1, since an enumerator needs to start with MoveNext before calling current  
        m_index = -1;  
      }  
      public T Current { get { return m_source.m_items[m_index]; } }  
      public void Dispose() { }  
      object System.Collections.IEnumerator.Current { get { return Current; } } 
      public bool MoveNext()  
      {  
        // if we have cached items, just increase our index  
        if (m_source.m_items.Count > m_index + 1)  
        {  
          m_index++;  
          return true;  
        }  
        else 
        {  
          var result = m_source.FetchOne();  
          if (result) m_index++;  
          return result;  
        }  
      }  
      public void Reset()  
      {  
        m_index = -1;  
      }  
    }  
    /// <summary>  
    /// list containing all the items  
    /// </summary>  
    private readonly List<T> m_items;  
    /// <summary>  
    /// callback how to fetch an item  
    /// </summary>  
    private readonly Func<Tuple<bool, T>> m_fetchMethod;  
    private readonly int m_targetSize;  
    public CachedEnumeration(int size, T firstItem, Func<Tuple<bool, T>> fetchMethod)  
    {  
      m_items = new List<T>(size);  
      m_items.Add(firstItem);  
      m_fetchMethod = fetchMethod;  
      m_targetSize = size;  
    }  
    public IEnumerator<T> GetEnumerator()  
    {  
      return new CachedEnumerator(this);  
    }  
    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()  
    {  
      return GetEnumerator();  
    }  
    private bool FetchOne()  
    {  
      if (IsFull) return false;  
      var result = m_fetchMethod();  
      if (result.Item1) m_items.Add(result.Item2);  
      return result.Item1;  
    }  
    /// <summary>  
    /// fetches all items to the cached enumerable  
    /// </summary>  
    public void FetchAll()  
    {  
      while (FetchOne()) { }  
    }  
    /// <summary>  
    /// tells whether the enumeration is already full  
    /// </summary>  
    public bool IsFull { get { return m_targetSize == m_items.Count; } }  
  }  
  /// <summary>  
  /// partitions the <paramref name="source"/> to parts of size <paramref name="size"/>  
  /// </summary>  
  public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> source, int size)  
  {  
    if (source == null) throw new ArgumentNullException("source");  
    if (size < 1) throw new ArgumentException(string.Format("The specified size ({0}) is invalid, it needs to be at least 1.", size), "size");  
    var enumerator = source.GetEnumerator();  
    while (enumerator.MoveNext())  
    {  
      var lastResult = new CachedEnumeration<T>(size, enumerator.Current, () => Tuple.Create(enumerator.MoveNext(), enumerator.Current));  
      yield return lastResult;  
      lastResult.FetchAll();  
    }  
  }  

You can find the unit tests and the source code here.

