我有一个数字列表,例如21,4,7,9,12,22,17,8,2,20,23。
我想要挑出连续数字序列(长度至少为3项),因此从上面的例子中,答案应该是7,8,9和20,21,22,23。
我已经尝试了一些复杂且冗长的函数,但我想知道是否有一种简洁的LINQ-ish方式来实现它。
有什么建议吗?
更新:
非常感谢所有回复,十分感激。我目前正在测试它们,看哪个最适合我们的项目。
我认为你应该首先对列表进行排序。之后,只需要遍历它,记住当前序列的长度并检测何时它已结束。说实话,我 怀疑 一个简单的foreach循环将是最简单的方法 - 我不能立即想到任何非常简洁的类似于LINQ的方法。如果你真的想这样做,可以在迭代器块中完成,但请记住,首先对列表进行排序意味着你已经有了相当“前期”的成本。所以我的解决方案会像这样:
var ordered = list.OrderBy(x => x);
int count = 0;
int firstItem = 0; // Irrelevant to start with
foreach (int x in ordered)
{
// First value in the ordered list: start of a sequence
if (count == 0)
{
firstItem = x;
count = 1;
}
// Skip duplicate values
else if (x == firstItem + count - 1)
{
// No need to do anything
}
// New value contributes to sequence
else if (x == firstItem + count)
{
count++;
}
// End of one sequence, start of another
else
{
if (count >= 3)
{
Console.WriteLine("Found sequence of length {0} starting at {1}",
count, firstItem);
}
count = 1;
firstItem = x;
}
}
if (count >= 3)
{
Console.WriteLine("Found sequence of length {0} starting at {1}",
count, firstItem);
}
编辑:好吧,我刚想到一种更符合LINQ思路的方法。我现在没有时间完全实现它,但是:
SelectWithPrevious
(可能更好地命名为SelectConsecutive
)的东西来获取连续的元素对SelectWithPrevious
,以获取两个起始点之间序列的长度(从前面减去一个索引)我猜测你需要在有序序列上连接int.MinValue
,以确保最后一个项目被正确使用。
编辑:好吧,我已经实现了这个方法。这大概是我能想到的最LINQ的方式...我使用空值作为“哨兵”值强制启动和结束序列-有关详细信息,请参见注释。
总的来说,我不建议使用这个解决方案。它很难理解,虽然我相当有信心它是正确的,但我还是花了一段时间思考可能出现的偏移错误等问题。这是进入LINQ可以做什么的有趣之旅...以及您可能不应该做什么的例子。
哦,还要注意,我将“长度至少为3”部分上推到了调用者 - 当你有这样的元组序列时,单独过滤它更清洁,我认为。
using System;
using System.Collections.Generic;
using System.Linq;
static class Extensions
{
public static IEnumerable<TResult> SelectConsecutive<TSource, TResult>
(this IEnumerable<TSource> source,
Func<TSource, TSource, TResult> selector)
{
using (IEnumerator<TSource> iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
TSource prev = iterator.Current;
while (iterator.MoveNext())
{
TSource current = iterator.Current;
yield return selector(prev, current);
prev = current;
}
}
}
}
class Test
{
static void Main()
{
var list = new List<int> { 21,4,7,9,12,22,17,8,2,20,23 };
foreach (var sequence in FindSequences(list).Where(x => x.Item1 >= 3))
{
Console.WriteLine("Found sequence of length {0} starting at {1}",
sequence.Item1, sequence.Item2);
}
}
private static readonly int?[] End = { null };
// Each tuple in the returned sequence is (length, first element)
public static IEnumerable<Tuple<int, int>> FindSequences
(IEnumerable<int> input)
{
// Use null values at the start and end of the ordered sequence
// so that the first pair always starts a new sequence starting
// with the lowest actual element, and the final pair always
// starts a new one starting with null. That "sequence at the end"
// is used to compute the length of the *real* final element.
return End.Concat(input.OrderBy(x => x)
.Select(x => (int?) x))
.Concat(End)
// Work out consecutive pairs of items
.SelectConsecutive((x, y) => Tuple.Create(x, y))
// Remove duplicates
.Where(z => z.Item1 != z.Item2)
// Keep the index so we can tell sequence length
.Select((z, index) => new { z, index })
// Find sequence starting points
.Where(both => both.z.Item2 != both.z.Item1 + 1)
.SelectConsecutive((start1, start2) =>
Tuple.Create(start2.index - start1.index,
start1.z.Item2.Value));
}
}
[5, 1, 2, 3]
不会找到 1,2,3
。 - TimwiDistinct()
方法进行修复。 - Jon SkeetJon Skeet和Timwi的解决方案是正确的选择。
为了好玩,这里有一个执行此任务的LINQ查询(非常低效):
var sequences = input.Distinct()
.GroupBy(num => Enumerable.Range(num, int.MaxValue - num + 1)
.TakeWhile(input.Contains)
.Last()) //use the last member of the consecutive sequence as the key
.Where(seq => seq.Count() >= 3)
.Select(seq => seq.OrderBy(num => num)); // not necessary unless ordering is desirable inside each sequence.
查询的性能可以通过将输入加载到 HashSet
中(以改善Contains
)稍微提高,但这仍然不能产生任何接近高效的解决方案。Range
的count
参数),可能会出现算术溢出。这可以通过自定义的 static IEnumerable<int> To(this int start, int end)
扩展方法轻松修复。如果有人能想出其他简单的规避溢出的技巧,请让我知道。var sequences = input.GroupBy(num => input.Where(candidate => candidate >= num)
.OrderBy(candidate => candidate)
.TakeWhile((candidate, index) => candidate == num + index)
.Last())
.Where(seq => seq.Count() >= 3)
.Select(seq => seq.OrderBy(num => num));
-1
也会因为溢出而失败。实际上,由于调用了Enumerable.Range(num, int.MaxValue - num + 1)
,所以0
也会失败。 - configuratorint.MaxValue
呢? - configuratorcount
中使用int.MaxValue
,这是不可能的,因为它会无法处理正数。还有其他建议吗?我认为Enumerable.Range
将始终存在此错误,因为无法将Int32的范围表示为Int32。 - Ani我认为我的解决方案更加优雅和简洁,因此更容易验证其正确性:
/// <summary>Returns a collection containing all consecutive sequences of
/// integers in the input collection.</summary>
/// <param name="input">The collection of integers in which to find
/// consecutive sequences.</param>
/// <param name="minLength">Minimum length that a sequence should have
/// to be returned.</param>
static IEnumerable<IEnumerable<int>> ConsecutiveSequences(
IEnumerable<int> input, int minLength = 1)
{
var results = new List<List<int>>();
foreach (var i in input.OrderBy(x => x))
{
var existing = results.FirstOrDefault(lst => lst.Last() + 1 == i);
if (existing == null)
results.Add(new List<int> { i });
else
existing.Add(i);
}
return minLength <= 1 ? results :
results.Where(lst => lst.Count >= minLength);
}
与其他解决方案相比的优点:
i
来表示 arr[i]
... 所以你能否只使用 foreach 循环?在我看来,这肯定会简化事情。 - Jon SkeetList<T>
本地变量并持续yield的方法呢? - Ani以下是以“LINQish”方式解决问题的方法:
int[] arr = new int[]{ 21, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 };
IOrderedEnumerable<int> sorted = arr.OrderBy(x => x);
int cnt = sorted.Count();
int[] sortedArr = sorted.ToArray();
IEnumerable<int> selected = sortedArr.Where((x, idx) =>
idx <= cnt - 3 && sortedArr[idx + 1] == x + 1 && sortedArr[idx + 2] == x + 2);
IEnumerable<int> result = selected.SelectMany(x => new int[] { x, x + 1, x + 2 }).Distinct();
Console.WriteLine(string.Join(",", result.Select(x=>x.ToString()).ToArray()));
[1, 2, 3, 2]
,它无法找到 1, 2, 3
。在你的情况下,修复方法更简单:只需在 .OrderBy()
后添加 .Distinct()
(并将 IOrderedEnumerable
更改为 IEnumerable
,甚至更好的是,将所有内容都更改为 var
)。 - Timwi虽然不是100%的Linq,但这是一个通用的变体:
static IEnumerable<IEnumerable<TItem>> GetSequences<TItem>(
int minSequenceLength,
Func<TItem, TItem, bool> areSequential,
IEnumerable<TItem> items)
where TItem : IComparable<TItem>
{
items = items
.OrderBy(n => n)
.Distinct().ToArray();
var lastSelected = default(TItem);
var sequences =
from startItem in items
where startItem.Equals(items.First())
|| startItem.CompareTo(lastSelected) > 0
let sequence =
from item in items
where item.Equals(startItem) || areSequential(lastSelected, item)
select (lastSelected = item)
where sequence.Count() >= minSequenceLength
select sequence;
return sequences;
}
static void UsageInt()
{
var sequences = GetSequences(
3,
(a, b) => a + 1 == b,
new[] { 21, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 });
foreach (var sequence in sequences)
Console.WriteLine(string.Join(", ", sequence.ToArray()));
}
static void UsageChar()
{
var list = new List<char>(
"abcdefghijklmnopqrstuvwxyz".ToCharArray());
var sequences = GetSequences(
3,
(a, b) => (list.IndexOf(a) + 1 == list.IndexOf(b)),
"PleaseBeGentleWithMe".ToLower().ToCharArray());
foreach (var sequence in sequences)
Console.WriteLine(string.Join(", ", sequence.ToArray()));
}
What about sorting the array then create another array that is the difference between each element the previous one
sortedArray = 8, 9, 10, 21, 22, 23, 24, 27, 30, 31, 32
diffArray = 1, 1, 11, 1, 1, 1, 3, 3, 1, 1
Now iterate through the difference array; if the difference equlas 1, increase the count of a variable, sequenceLength, by 1. If the difference is > 1, check the sequenceLength if it is >=2 then you have a sequence of at at least 3 consecutive elements. Then reset sequenceLenght to 0 and continue your loop on the difference array.
这是我的尝试:
public static class SequenceDetector
{
public static IEnumerable<IEnumerable<T>> DetectSequenceWhere<T>(this IEnumerable<T> sequence, Func<T, T, bool> inSequenceSelector)
{
List<T> subsequence = null;
// We can only have a sequence with 2 or more items
T last = sequence.FirstOrDefault();
foreach (var item in sequence.Skip(1))
{
if (inSequenceSelector(last, item))
{
// These form part of a sequence
if (subsequence == null)
{
subsequence = new List<T>();
subsequence.Add(last);
}
subsequence.Add(item);
}
else if (subsequence != null)
{
// We have a previous seq to return
yield return subsequence;
subsequence = null;
}
last = item;
}
if (subsequence != null)
{
// Return any trailing seq
yield return subsequence;
}
}
}
public class test
{
public static void run()
{
var list = new List<int> { 21, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 };
foreach (var subsequence in list
.OrderBy(i => i)
.Distinct()
.DetectSequenceWhere((first, second) => first + 1 == second)
.Where(seq => seq.Count() >= 3))
{
Console.WriteLine("Found subsequence {0}",
string.Join(", ", subsequence.Select(i => i.ToString()).ToArray()));
}
}
}
这将返回形成子序列的特定项,并允许任何类型的项和任何标准的定义,只要可以通过比较相邻项来确定。
这里是我用F#想出来的一个解决方案,将其转换成C# LINQ查询应该很容易,因为fold基本上等同于LINQ的aggregate运算符。
#light
let nums = [21;4;7;9;12;22;17;8;2;20;23]
let scanFunc (mainSeqLength, mainCounter, lastNum:int, subSequenceCounter:int, subSequence:'a list, foundSequences:'a list list) (num:'a) =
(mainSeqLength, mainCounter + 1,
num,
(if num <> lastNum + 1 then 1 else subSequenceCounter+1),
(if num <> lastNum + 1 then [num] else subSequence@[num]),
if subSequenceCounter >= 3 then
if mainSeqLength = mainCounter+1
then foundSequences @ [subSequence@[num]]
elif num <> lastNum + 1
then foundSequences @ [subSequence]
else foundSequences
else foundSequences)
let subSequences = nums |> Seq.sort |> Seq.fold scanFunc (nums |> Seq.length, 0, 0, 0, [], []) |> fun (_,_,_,_,_,results) -> results
Linq并不是万能的解决方案,有时候使用简单的循环会更好。这里提供了一个解决方案,只需使用少量的Linq来对原始序列进行排序和过滤结果。
void Main()
{
var numbers = new[] { 21,4,7,9,12,22,17,8,2,20,23 };
var sequences =
GetSequences(numbers, (prev, curr) => curr == prev + 1);
.Where(s => s.Count() >= 3);
sequences.Dump();
}
public static IEnumerable<IEnumerable<T>> GetSequences<T>(
IEnumerable<T> source,
Func<T, T, bool> areConsecutive)
{
bool first = true;
T prev = default(T);
List<T> seq = new List<T>();
foreach (var i in source.OrderBy(i => i))
{
if (!first && !areConsecutive(prev, i))
{
yield return seq.ToArray();
seq.Clear();
}
first = false;
seq.Add(i);
prev = i;
}
if (seq.Any())
yield return seq.ToArray();
}
我和Jon想到了同样的事情:要表示一系列连续的整数,你只需要两个微不足道的整数!所以我会从这里开始:
struct Range : IEnumerable<int>
{
readonly int _start;
readonly int _count;
public Range(int start, int count)
{
_start = start;
_count = count;
}
public int Start
{
get { return _start; }
}
public int Count
{
get { return _count; }
}
public int End
{
get { return _start + _count - 1; }
}
public IEnumerator<int> GetEnumerator()
{
for (int i = 0; i < _count; ++i)
{
yield return _start + i;
}
}
// Heck, why not?
public static Range operator +(Range x, int y)
{
return new Range(x.Start, x.Count + y);
}
// skipping the explicit IEnumerable.GetEnumerator implementation
}
从那里开始,您可以编写一个静态方法,返回一堆这些Range
值,对应于您的序列连续数字。
static IEnumerable<Range> FindRanges(IEnumerable<int> source, int minCount)
{
// throw exceptions on invalid arguments, maybe...
var ordered = source.OrderBy(x => x);
Range r = default(Range);
foreach (int value in ordered)
{
// In "real" code I would've overridden the Equals method
// and overloaded the == operator to write something like
// if (r == Range.Empty) here... but this works well enough
// for now, since the only time r.Count will be 0 is on the
// first item.
if (r.Count == 0)
{
r = new Range(value, 1);
continue;
}
if (value == r.End)
{
// skip duplicates
continue;
}
else if (value == r.End + 1)
{
// "append" consecutive values to the range
r += 1;
}
else
{
// return what we've got so far
if (r.Count >= minCount)
{
yield return r;
}
// start over
r = new Range(value, 1);
}
}
// return whatever we ended up with
if (r.Count >= minCount)
{
yield return r;
}
}
演示:
int[] numbers = new[] { 21, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 };
foreach (Range r in FindConsecutiveRanges(numbers, 3))
{
// Using .NET 3.5 here, don't have the much nicer string.Join overloads.
Console.WriteLine(string.Join(", ", r.Select(x => x.ToString()).ToArray()));
}
输出:
7、8、9 20、21、22、23