递归列表拍平

41

我可能可以自己写这个代码,但是我尝试编写一个通用的扩展方法,类似于.NET 3.5中引入的其他扩展方法,它将获取一个嵌套的IEnumerable集合(等等)并将其展开为一个IEnumerable。有人有什么想法吗?

具体来说,我在扩展方法的语法上遇到了问题,以便我可以编写展平算法。


你的数据是如何表示的?与XPath一样简单的方法并不存在。 - Ray Hayes
4
没有任何迹象表明这是一道作业问题... - Tim Frey
数据只是普通的对象。 - Matt H
非法程序员:没有明确的内容,但是当问题涉及语法并且没有代码时,很有可能这是一份作业。(但发帖者可以自由删除此信息。) - Jon Ericson
相关(Eric Lippert的使用堆栈处理更大数据结构的解决方案):https://dev59.com/1Ggt5IYBdhLWcg3wxANj#20335369 - Mafii
13个回答

50

这里有一个可能会有所帮助的扩展。它将遍历对象层次结构中的所有节点,并挑选出符合条件的节点。它假设您的层次结构中的每个对象都有一个集合属性,用于保存其子对象。

以下是该扩展:

/// Traverses an object hierarchy and return a flattened list of elements
/// based on a predicate.
/// 
/// TSource: The type of object in your collection.</typeparam>
/// source: The collection of your topmost TSource objects.</param>
/// selectorFunction: A predicate for choosing the objects you want.
/// getChildrenFunction: A function that fetches the child collection from an object.
/// returns: A flattened list of objects which meet the criteria in selectorFunction.
public static IEnumerable<TSource> Map<TSource>(
  this IEnumerable<TSource> source,
  Func<TSource, bool> selectorFunction,
  Func<TSource, IEnumerable<TSource>> getChildrenFunction)
{
  // Add what we have to the stack
  var flattenedList = source.Where(selectorFunction);

  // Go through the input enumerable looking for children,
  // and add those if we have them
  foreach (TSource element in source)
  {
    flattenedList = flattenedList.Concat(
      getChildrenFunction(element).Map(selectorFunction,
                                       getChildrenFunction)
    );
  }
  return flattenedList;
}

示例(单元测试):

首先,我们需要一个对象和一个嵌套的对象层次结构。

一个简单的节点类。

class Node
{
  public int NodeId { get; set; }
  public int LevelId { get; set; }
  public IEnumerable<Node> Children { get; set; }

  public override string ToString()
  {
    return String.Format("Node {0}, Level {1}", this.NodeId, this.LevelId);
  }
}

并且提供一种获取三级节点深度层次结构的方法

private IEnumerable<Node> GetNodes()
{
  // Create a 3-level deep hierarchy of nodes
  Node[] nodes = new Node[]
    {
      new Node 
      { 
        NodeId = 1, 
        LevelId = 1, 
        Children = new Node[]
        {
          new Node { NodeId = 2, LevelId = 2, Children = new Node[] {} },
          new Node
          {
            NodeId = 3,
            LevelId = 2,
            Children = new Node[]
            {
              new Node { NodeId = 4, LevelId = 3, Children = new Node[] {} },
              new Node { NodeId = 5, LevelId = 3, Children = new Node[] {} }
            }
          }
        }
      },
      new Node { NodeId = 6, LevelId = 1, Children = new Node[] {} }
    };
  return nodes;
}

第一次测试:展平层次结构,不进行过滤

[Test]
public void Flatten_Nested_Heirachy()
{
  IEnumerable<Node> nodes = GetNodes();
  var flattenedNodes = nodes.Map(
    p => true, 
    (Node n) => { return n.Children; }
  );
  foreach (Node flatNode in flattenedNodes)
  {
    Console.WriteLine(flatNode.ToString());
  }

  // Make sure we only end up with 6 nodes
  Assert.AreEqual(6, flattenedNodes.Count());
}

这将显示:

Node 1, Level 1
Node 6, Level 1
Node 2, Level 2
Node 3, Level 2
Node 4, Level 3
Node 5, Level 3

第二个测试:获取具有偶数NodeId的节点列表

[Test]
public void Only_Return_Nodes_With_Even_Numbered_Node_IDs()
{
  IEnumerable<Node> nodes = GetNodes();
  var flattenedNodes = nodes.Map(
    p => (p.NodeId % 2) == 0, 
    (Node n) => { return n.Children; }
  );
  foreach (Node flatNode in flattenedNodes)
  {
    Console.WriteLine(flatNode.ToString());
  }
  // Make sure we only end up with 3 nodes
  Assert.AreEqual(3, flattenedNodes.Count());
}

这将显示:

Node 6, Level 1
Node 2, Level 2
Node 4, Level 3

@Marc 感谢您的编辑!@lukevenediger 如果您首先将TreeNode集合转换为IEnumerable兼容形式,例如:IEnumerable<TreeNode> theNodes1 = treeView2.Nodes.OfType<TreeNode>(); 您可以使用Map扩展来处理TreeView的TreeNode集合。或者使用以下方法:IEnumerable<TreeNode> theNodes2 = treeView2.Nodes.Cast<TreeNode>().Select(node => node); - BillW
1
只是一点小建议,单元测试中返回值的方式可以更简洁。你可以这样调用:nodes.Map(i => true, n => n.Children)。我还建议不要将这个扩展方法命名为“Map”,“Flatten”更清晰地描述了它的功能。 - Amicable

20

嗯...我不确定您想要什么内容,但这是一个“单层次”选项:

public static IEnumerable<TElement> Flatten<TElement,TSequence> (this IEnumerable<TSequence> sequences)
    where TSequence : IEnumerable<TElement> 
{
    foreach (TSequence sequence in sequences)
    {
        foreach(TElement element in sequence)
        {
            yield return element;
        }
    }
}

如果这不是你想要的,请提供你所需要的签名。如果你不需要一个通用的表单,而只是想做LINQ到XML构造函数所做的事情,那么这是相当简单的 - 尽管迭代器块的递归使用相对较低效。可以尝试以下代码:

static IEnumerable Flatten(params object[] objects)
{
    // Can't easily get varargs behaviour with IEnumerable
    return Flatten((IEnumerable) objects);
}

static IEnumerable Flatten(IEnumerable enumerable)
{
    foreach (object element in enumerable)
    {
        IEnumerable candidate = element as IEnumerable;
        if (candidate != null)
        {
            foreach (object nested in candidate)
            {
                yield return nested;
            }
        }
        else
        {
            yield return element;
        }
    }
}

请注意,这将把一个字符串视为一系列字符,但是 - 根据您的用例,您可能希望将字符串特殊处理为单独的元素,而不是将它们展开。
这有帮助吗?

15

我想分享一个完整的例子,包括错误处理和单一逻辑方法。

递归展开就像这样简单:

LINQ 版本

public static class IEnumerableExtensions
{
    public static IEnumerable<T> SelectManyRecursive<T>(this IEnumerable<T> source, Func<T, IEnumerable<T>> selector)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (selector == null) throw new ArgumentNullException("selector");

        return !source.Any() ? source :
            source.Concat(
                source
                .SelectMany(i => selector(i).EmptyIfNull())
                .SelectManyRecursive(selector)
            );
    }

    public static IEnumerable<T> EmptyIfNull<T>(this IEnumerable<T> source)
    {
        return source ?? Enumerable.Empty<T>();
    }
}

非LINQ版本

public static class IEnumerableExtensions
{
    public static IEnumerable<T> SelectManyRecursive<T>(this IEnumerable<T> source, Func<T, IEnumerable<T>> selector)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (selector == null) throw new ArgumentNullException("selector");

        foreach (T item in source)
        {
            yield return item;

            var children = selector(item);
            if (children == null)
                continue;

            foreach (T descendant in children.SelectManyRecursive(selector))
            {
                yield return descendant;
            }
        }
    }
}

设计决策

我决定:

  • 禁止对空的IEnumerable进行展开,可以通过删除异常抛出并执行以下操作来更改:
    • 在第一个版本中,在return之前添加source = source.EmptyIfNull();
    • 在第二个版本中,在foreach之前添加if (source != null)
  • 允许选择器返回空集合 - 这样可以将责任从调用者身上转移,以确保子列表不为空,可以通过以下方式进行更改:
    • 在第一个版本中删除.EmptyIfNull() - 请注意,如果选择器返回null,则SelectMany将失败
    • 在第二个版本中删除if (children == null) continue; - 请注意,如果IEnumerable参数为null,则foreach将失败
  • 允许在调用方或子选择器中使用.Where子句过滤子项,而不是传递一个子项过滤选择器参数:
    • 这不会影响效率,因为在两个版本中都是延迟调用
    • 这将把另一个逻辑与方法混合在一起,我更喜欢将逻辑分开

示例用法

我在LightSwitch中使用此扩展方法来获取屏幕上的所有控件:

public static class ScreenObjectExtensions
{
    public static IEnumerable<IContentItemProxy> FindControls(this IScreenObject screen)
    {
        var model = screen.Details.GetModel();

        return model.GetChildItems()
            .SelectManyRecursive(c => c.GetChildItems())
            .OfType<IContentItemDefinition>()
            .Select(c => screen.FindControl(c.Name));
    }
}

谢谢。我自己的版本中只是缺少了 !source.Any() ? source 部分。 - FindOutIslamNow

7

这是一个修改过的Jon Skeet的答案,允许多级:

static IEnumerable Flatten(IEnumerable enumerable)
{
    foreach (object element in enumerable)
    {
        IEnumerable candidate = element as IEnumerable;
        if (candidate != null)
        {
            foreach (object nested in Flatten(candidate))
            {
                yield return nested;
            }
        }
        else
        {
            yield return element;
        }
    }
}

声明:我不懂C#。

同样的Python代码:

#!/usr/bin/env python

def flatten(iterable):
    for item in iterable:
        if hasattr(item, '__iter__'):
            for nested in flatten(item):
                yield nested
        else:
            yield item

if __name__ == '__main__':
    for item in flatten([1,[2, 3, [[4], 5]], 6, [[[7]]], [8]]):
        print(item, end=" ")

它打印:
1 2 3 4 5 6 7 8 

6
这不是[SelectMany][1]的作用吗?
enum1.SelectMany(
    a => a.SelectMany(
        b => b.SelectMany(
            c => c.Select(
                d => d.Name
            )
        )
    )
);

8
这并不是完全递归,它只处理预定义的层数,在这种情况下是3。真正的递归可以处理无限层数。 - Peter

4

功能:

public static class MyExtentions
{
    public static IEnumerable<T> RecursiveSelector<T>(this IEnumerable<T> nodes, Func<T, IEnumerable<T>> selector)
    {
        if(nodes.Any() == false)
        {
            return nodes; 
        }

        var descendants = nodes
                            .SelectMany(selector)
                            .RecursiveSelector(selector);

        return nodes.Concat(descendants);
    } 
}

使用方法:

var ar = new[]
{
    new Node
    {
        Name = "1",
        Chilren = new[]
        {
            new Node
            {
                Name = "11",
                Children = new[]
                {
                    new Node
                    {
                        Name = "111",
                        
                    }
                }
            }
        }
    }
};

var flattened = ar.RecursiveSelector(x => x.Children).ToList();

2

好的,这里是另一个版本,结合了上面大约3个答案。

递归。使用yield。通用的。可选的过滤器谓词。可选的选择函数。尽可能简洁。

    public static IEnumerable<TNode> Flatten<TNode>(
        this IEnumerable<TNode> nodes, 
        Func<TNode, bool> filterBy = null,
        Func<TNode, IEnumerable<TNode>> selectChildren = null
        )
    {
        if (nodes == null) yield break;
        if (filterBy != null) nodes = nodes.Where(filterBy);

        foreach (var node in nodes)
        {
            yield return node;

            var children = (selectChildren == null)
                ? node as IEnumerable<TNode>
                : selectChildren(node);

            if (children == null) continue;

            foreach (var child in children.Flatten(filterBy, selectChildren))
            {
                yield return child;
            }
        }
    }

使用方法:

// With filter predicate, with selection function
var flatList = nodes.Flatten(n => n.IsDeleted == false, n => n.Children);

1
由于VB中没有yield,而LINQ提供了延迟执行和简洁的语法,因此您也可以使用它。
<Extension()>
Public Function Flatten(Of T)(ByVal objects As Generic.IEnumerable(Of T), ByVal selector As Func(Of T, Generic.IEnumerable(Of T))) As Generic.IEnumerable(Of T)
   If(objects.Any()) Then
      Return objects.Union(objects.Select(selector).Where(e => e != null).SelectMany(e => e)).Flatten(selector))
   Else
      Return objects 
   End If
End Function

public static class Extensions{
  public static IEnumerable<T> Flatten<T>(this IEnumerable<T> objects, Func<T, IEnumerable<T>> selector) where T:Component{
    if(objects.Any()){
        return objects.Union(objects.Select(selector).Where(e => e != null).SelectMany(e => e).Flatten(selector));
    }
    return objects;
  }
}

编辑 包括:


1

SelectMany 扩展方法已经实现了这个功能。

将序列的每个元素投影到一个 IEnumerable<(Of <(T>)>) 中,并将生成的序列平坦化为一个序列。


11
SelectMany 不是递归的。它只会对第一层级进行处理。 - Sarin

1
我不得不从头开始实现我的解决方案,因为所有提供的解决方案都会在出现循环引用,即子元素指向其祖先元素时出错。如果你有和我一样的需求,请看一下这个(也请告诉我我的解决方案是否会在特殊情况下出错):
使用方法:
var flattenlist = rootItem.Flatten(obj => obj.ChildItems, obj => obj.Id)

代码:

public static class Extensions
    {
        /// <summary>
        /// This would flatten out a recursive data structure ignoring the loops. The end result would be an enumerable which enumerates all the
        /// items in the data structure regardless of the level of nesting.
        /// </summary>
        /// <typeparam name="T">Type of the recursive data structure</typeparam>
        /// <param name="source">Source element</param>
        /// <param name="childrenSelector">a function that returns the children of a given data element of type T</param>
        /// <param name="keySelector">a function that returns a key value for each element</param>
        /// <returns>a faltten list of all the items within recursive data structure of T</returns>
        public static IEnumerable<T> Flatten<T>(this IEnumerable<T> source,
            Func<T, IEnumerable<T>> childrenSelector,
            Func<T, object> keySelector) where T : class
        {
            if (source == null)
                throw new ArgumentNullException("source");
            if (childrenSelector == null)
                throw new ArgumentNullException("childrenSelector");
            if (keySelector == null)
                throw new ArgumentNullException("keySelector");
            var stack = new Stack<T>( source);
            var dictionary = new Dictionary<object, T>();
            while (stack.Any())
            {
                var currentItem = stack.Pop();
                var currentkey = keySelector(currentItem);
                if (dictionary.ContainsKey(currentkey) == false)
                {
                    dictionary.Add(currentkey, currentItem);
                    var children = childrenSelector(currentItem);
                    if (children != null)
                    {
                        foreach (var child in children)
                        {
                            stack.Push(child);
                        }
                    }
                }
                yield return currentItem;
            }
        }

        /// <summary>
        /// This would flatten out a recursive data structure ignoring the loops. The     end result would be an enumerable which enumerates all the
        /// items in the data structure regardless of the level of nesting.
        /// </summary>
        /// <typeparam name="T">Type of the recursive data structure</typeparam>
        /// <param name="source">Source element</param>
        /// <param name="childrenSelector">a function that returns the children of a     given data element of type T</param>
        /// <param name="keySelector">a function that returns a key value for each   element</param>
        /// <returns>a faltten list of all the items within recursive data structure of T</returns>
        public static IEnumerable<T> Flatten<T>(this T source, 
            Func<T, IEnumerable<T>> childrenSelector,
            Func<T, object> keySelector) where T: class
        {
            return Flatten(new [] {source}, childrenSelector, keySelector);
        }
    }

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接