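Good afternoon,
I need to split an array into smaller "chunks".
I am passing in about 1200 items and need to split them into more manageable arrays of 100 items each, which I then need to process.
Could anyone make some suggestions?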
Array.Copy has been around since version 1.1 and does an excellent job of chunking arrays.
List.GetRange(), mentioned in another answer, is also a good choice.
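string[] buffer;
for (int i = 0; i < source.Length; i += 100)
{
    // the last chunk may hold fewer than 100 items
    int length = Math.Min(100, source.Length - i);
    buffer = new string[length];
    Array.Copy(source, i, buffer, 0, length);
    // process array
}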
And to make an extension for it:
public static class Extensions
{
public static T[] Slice<T>(this T[] source, int index, int length)
{
T[] slice = new T[length];
Array.Copy(source, index, slice, 0, length);
return slice;
}
}
And using the extension:
string[] source = new string[] { 1200 items here };
// get the first 100
string[] slice = source.Slice(0, 100);
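When the array length is not an exact multiple of the chunk size, the final chunk has to be shorter than the rest. Below is a minimal sketch of a helper that wraps the Array.Copy approach and handles that case. The class name ArrayChunkExtensions and the method name SplitIntoChunks are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayChunkExtensions
{
    // Hypothetical helper (name chosen for illustration): copies the source
    // into arrays of at most chunkSize items; the last one may be shorter.
    public static List<T[]> SplitIntoChunks<T>(this T[] source, int chunkSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        // Pre-size the list: ceil(source.Length / chunkSize) chunks.
        var chunks = new List<T[]>((source.Length + chunkSize - 1) / chunkSize);
        for (int i = 0; i < source.Length; i += chunkSize)
        {
            // Never read past the end of the source array.
            int length = Math.Min(chunkSize, source.Length - i);
            T[] chunk = new T[length];
            Array.Copy(source, i, chunk, 0, length);
            chunks.Add(chunk);
        }
        return chunks;
    }
}
```

With 1200 items, source.SplitIntoChunks(100) would give 12 arrays of 100 items each; with 1250 items, the thirteenth array would hold the remaining 50.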
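You can also use ArraySegment<>. No performance check is needed, because it simply uses the original array as its source and maintains Offset and Count properties to determine the "segment". Unfortunately, there is no method that retrieves just the segment as an array, so some people have written wrappers, such as here: ArraySegment - Returning the actual segment C#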
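ArraySegment<string> segment;
for (int i = 0; i < source.Length; i += 100)
{
    // the last segment may cover fewer than 100 items
    segment = new ArraySegment<string>(source, i, Math.Min(100, source.Length - i));
    // and to loop through the segment (stop at the end of the segment, not of the array)
    for (int s = segment.Offset; s < segment.Offset + segment.Count; s++)
    {
        Console.WriteLine(segment.Array[s]);
    }
}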
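Since each segment is only a view defined by an offset and a count over the original array, the whole split can be produced without copying any items. The following is a rough sketch of that idea; the names ArraySegmentChunks and Segments are hypothetical and not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class ArraySegmentChunks
{
    // Hypothetical helper: yields one ArraySegment per chunk. No items are
    // copied; every segment refers back to the same source array.
    public static IEnumerable<ArraySegment<T>> Segments<T>(T[] source, int size)
    {
        for (int i = 0; i < source.Length; i += size)
        {
            // The last segment is shortened so it stays inside the array.
            int count = Math.Min(size, source.Length - i);
            yield return new ArraySegment<T>(source, i, count);
        }
    }
}
```

Each yielded segment can then be walked from segment.Offset up to, but not including, segment.Offset + segment.Count. Because nothing is copied, changes made to the source array afterwards are visible through the segments.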
Test method (in Release mode):
static void Main(string[] args)
{
string[] source = new string[1000000];
for (int i = 0; i < source.Length; i++)
{
source[i] = "string " + i.ToString();
}
string[] buffer;
Console.WriteLine("Starting stop watch");
Stopwatch sw = new Stopwatch();
for (int n = 0; n < 5; n++)
{
sw.Reset();
sw.Start();
for (int i = 0; i < source.Length; i += 100)
{
buffer = new string[100];
Array.Copy(source, i, buffer, 0, 100);
}
sw.Stop();
Console.WriteLine("Array.Copy: " + sw.ElapsedMilliseconds.ToString());
sw.Reset();
sw.Start();
for (int i = 0; i < source.Length; i += 100)
{
buffer = new string[100];
buffer = source.Skip(i).Take(100).ToArray();
}
sw.Stop();
Console.WriteLine("Skip/Take: " + sw.ElapsedMilliseconds.ToString());
sw.Reset();
sw.Start();
String[][] chunks = source
.Select((s, i) => new { Value = s, Index = i })
.GroupBy(x => x.Index / 100)
.Select(grp => grp.Select(x => x.Value).ToArray())
.ToArray();
sw.Stop();
Console.WriteLine("LINQ: " + sw.ElapsedMilliseconds.ToString());
}
Console.ReadLine();
}
Results (in milliseconds):
Array.Copy: 15
Skip/Take: 42464
LINQ: 881
Array.Copy: 21
Skip/Take: 42284
LINQ: 585
Array.Copy: 11
Skip/Take: 43223
LINQ: 760
Array.Copy: 9
Skip/Take: 42842
LINQ: 525
Array.Copy: 24
Skip/Take: 43134
LINQ: 638
You can use LINQ to group all items by the chunk size and then create new arrays.
// build sample data with 1200 Strings
string[] items = Enumerable.Range(1, 1200).Select(i => "Item" + i).ToArray();
// split on groups with each 100 items
String[][] chunks = items
.Select((s, i) => new { Value = s, Index = i })
.GroupBy(x => x.Index / 100)
.Select(grp => grp.Select(x => x.Value).ToArray())
.ToArray();
for (int i = 0; i < chunks.Length; i++)
{
foreach (var item in chunks[i])
Console.WriteLine("chunk:{0} {1}", i, item);
}
Note that creating new arrays (which costs CPU cycles and memory) is not necessary. If you omit the two ToArray calls, you can work with an IEnumerable<IEnumerable<String>> instead.
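As a sketch of what dropping the two ToArray calls might look like, the query can be wrapped in a method that returns the nested sequence directly. The class name LazyChunks and the method name Split are hypothetical, used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyChunks
{
    // Hypothetical wrapper around the GroupBy query without the ToArray calls.
    // Nothing runs until the result is enumerated (deferred execution).
    public static IEnumerable<IEnumerable<string>> Split(IEnumerable<string> items, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        return items
            .Select((s, i) => new { Value = s, Index = i })  // pair each item with its position
            .GroupBy(x => x.Index / chunkSize)               // integer division gives the chunk number
            .Select(grp => grp.Select(x => x.Value));        // strip the index again
    }
}
```

Keep in mind that GroupBy still reads the whole source and buffers its groups internally once enumeration starts, so this avoids the extra array allocations but is not a fully streaming split.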
Here is the running code: http://ideone.com/K7Hn2
Here I found another LINQ solution:
int[] source = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
int i = 0;
int chunkSize = 3;
int[][] result = source.GroupBy(s => i++ / chunkSize).Select(g => g.ToArray()).ToArray();
//result = [1,2,3][4,5,6][7,8,9]
You can use Skip() and Take().
string[] items = new string[]{ "a", "b", "c"};
string[] chunk = items.Skip(1).Take(1).ToArray();
string[] amzProductAsins = GetProductAsin();
List<string[]> chunks = new List<string[]>();
// arrays expose Length, not Count
for (int i = 0; i < amzProductAsins.Length; i += 100)
{
    chunks.Add(amzProductAsins.Skip(i).Take(100).ToArray());
}
for(var i = 0; i < source.Count; i += chunkSize)
{
List<string> items = source.GetRange(i, Math.Min(chunkSize, source.Count - i));
}
It is not as fast as Array.Copy, but I think it looks cleaner:
var list = Enumerable.Range(0, 723748).ToList();
var stopwatch = new Stopwatch();
for (int n = 0; n < 5; n++)
{
stopwatch.Reset();
stopwatch.Start();
for(int i = 0; i < list.Count; i += 100)
{
List<int> c = list.GetRange(i, Math.Min(100, list.Count - i));
}
stopwatch.Stop();
Console.WriteLine("List<T>.GetRange: " + stopwatch.ElapsedMilliseconds.ToString());
stopwatch.Reset();
stopwatch.Start();
for (int i = 0; i < list.Count; i += 100)
{
List<int> c = list.Skip(i).Take(100).ToList();
}
stopwatch.Stop();
Console.WriteLine("Skip/Take: " + stopwatch.ElapsedMilliseconds.ToString());
stopwatch.Reset();
stopwatch.Start();
var test = list.ToArray();
for (int i = 0; i < list.Count; i += 100)
{
int length = Math.Min(100, list.Count - i);
int[] c = new int[length];
Array.Copy(test, i, c, 0, length);
}
stopwatch.Stop();
Console.WriteLine("Array.Copy: " + stopwatch.ElapsedMilliseconds.ToString());
stopwatch.Reset();
stopwatch.Start();
List<List<int>> chunks = list
.Select((s, i) => new { Value = s, Index = i })
.GroupBy(x => x.Index / 100)
.Select(grp => grp.Select(x => x.Value).ToList())
.ToList();
stopwatch.Stop();
Console.WriteLine("LINQ: " + stopwatch.ElapsedMilliseconds.ToString());
}
Results in milliseconds:
List<T>.GetRange: 1
Skip/Take: 9820
Array.Copy: 1
LINQ: 161
List<T>.GetRange: 9
Skip/Take: 9237
Array.Copy: 1
LINQ: 148
List<T>.GetRange: 5
Skip/Take: 9470
Array.Copy: 1
LINQ: 186
List<T>.GetRange: 0
Skip/Take: 9498
Array.Copy: 1
LINQ: 110
List<T>.GetRange: 8
Skip/Take: 9717
Array.Copy: 1
LINQ: 148
public static IEnumerable<IEnumerable<T>> SplitList<T>(this IEnumerable<T> source, int maxPerList)
{
var enumerable = source as IList<T> ?? source.ToList();
if (!enumerable.Any())
{
return new List<IEnumerable<T>>();
}
return (new List<IEnumerable<T>>() { enumerable.Take(maxPerList) }).Concat(enumerable.Skip(maxPerList).SplitList<T>(maxPerList));
}
int[] arrInput = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 };
var result = SplitArrey(arrInput, 5);
foreach (var item in result) {
Console.WriteLine(" {0}", String.Join(" ", item));
}
The function is:
public static List<int[]> SplitArrey(int[] arrInput, int nColumn) {
List<int[]> result = new List<int[]>(nColumn);
int itemsForColum = arrInput.Length / nColumn;
int countSpareElement = arrInput.Length - (itemsForColum * nColumn);
// Add an extra slot to the first columns to hold the spare elements
int[] newColumLenght = new int[nColumn];
for (int i = 0; i < nColumn; i++)
{
int addOne = (i < countSpareElement) ? 1 : 0;
newColumLenght[i] = itemsForColum + addOne;
result.Add(new int[itemsForColum + addOne]);
}
// Copy the values
int offset = 0;
for (int i = 0; i < nColumn; i++)
{
int count_items_to_copy = newColumLenght[i];
Array.Copy(arrInput, offset, result[i], 0, count_items_to_copy);
offset += newColumLenght[i];
}
return result;
}
The result is:
1 2 3
4 5 6
7 8
9 10
11 12