Grouping alternating pairs using LINQ

12

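I am trying to group a list of DTOs that contains alternating family pairs into the format shown below, to minimize duplication.

This is the DTO structure I currently have. As you can see, it contains duplicate rows that can be grouped by their reverse relation.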

+----------+------------+-----------+
| PersonId | RelativeId | Relation  |
+----------+------------+-----------+
|        1 |          2 | "Son"     |
|        2 |          1 | "Father"  |
|        1 |          3 | "Mother"  |
|        3 |          1 | "Son"     |
|        2 |          3 | "Husband" |
|        3 |          2 | "Wife"    |
+----------+------------+-----------+

I want to turn it into something like this:

+----------+------------+-----------+-----------------+
| PersonId | RelativeId | Relation  | ReverseRelation |
+----------+------------+-----------+-----------------+
|        1 |          2 | "Son"     | "Father"        |
|        1 |          3 | "Mother"  | "Son"           |
|        2 |          3 | "Husband" | "Wife"          |
+----------+------------+-----------+-----------------+

The code I am trying:

Program.cs

class Program
{
    static void Main(string[] args)
    {
        List<RelationDTO> relationDTOList = new List<RelationDTO>
        {
            new RelationDTO { PersonId = 1, RelativeId = 2, Relation = "Son" },
            new RelationDTO { PersonId = 2, RelativeId = 1, Relation = "Father" },

            new RelationDTO { PersonId = 1, RelativeId = 3, Relation = "Mother" },
            new RelationDTO { PersonId = 3, RelativeId = 1, Relation = "Son" },

            new RelationDTO { PersonId = 2, RelativeId = 3, Relation = "Husband" },
            new RelationDTO { PersonId = 3, RelativeId = 2, Relation = "Wife" },
        };

        var grp = relationDTOList.GroupBy(x => new { x.PersonId }).ToList();
    }
}

RelationDTO.cs

public class RelationDTO
{
    public int PersonId { get; set; }
    public int RelativeId { get; set; }
    public string Relation { get; set; }
}

Relations.cs

public class Relations
{
    public int PersonId { get; set; }
    public int RelativeId { get; set; }
    public string Relation { get; set; }
    public string ReverseRelation { get; set; }
}

6 Answers

8

You can use something like a join operation:

var result = relationDTOList
.Where(v => v.PersonId < v.RelativeId)
.Join(
    relationDTOList.Where(v => v.PersonId > v.RelativeId),
    v => new Key{PersonId = v.PersonId, RelativeId = v.RelativeId},
    v => new Key{PersonId = v.RelativeId, RelativeId = v.PersonId},
    (p, q) => new Relations
    {
        PersonId = p.PersonId,
        RelativeId = p.RelativeId,
        Relation = p.Relation,
        ReverseRelation = q.Relation
    }
);

where Key is:

public struct Key
{
    public int PersonId { get; set; }
    public int RelativeId { get; set; }
}
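
As a possible variation on the join above, the custom Key struct could be replaced by a value tuple, which already has structural equality. This is only a sketch, assuming C# 7.0 or later; the class name TupleKeyJoin and the method name Pair are made up for illustration and do not come from the answer.

```csharp
using System.Collections.Generic;
using System.Linq;

public class RelationDTO
{
    public int PersonId { get; set; }
    public int RelativeId { get; set; }
    public string Relation { get; set; }
}

public class Relations
{
    public int PersonId { get; set; }
    public int RelativeId { get; set; }
    public string Relation { get; set; }
    public string ReverseRelation { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class TupleKeyJoin
{
    public static List<Relations> Pair(IEnumerable<RelationDTO> rows)
    {
        var list = rows.ToList();
        return list
            // Keep only the "forward" rows as the outer sequence.
            .Where(v => v.PersonId < v.RelativeId)
            .Join(
                // The "reverse" rows form the inner sequence.
                list.Where(v => v.PersonId > v.RelativeId),
                // Value tuples compare by value, so no custom key type is needed.
                v => (First: v.PersonId, Second: v.RelativeId),
                v => (First: v.RelativeId, Second: v.PersonId),
                (p, q) => new Relations
                {
                    PersonId = p.PersonId,
                    RelativeId = p.RelativeId,
                    Relation = p.Relation,
                    ReverseRelation = q.Relation
                })
            .ToList();
    }
}
```

Calling TupleKeyJoin.Pair(relationDTOList) on the list from the question should give the three combined rows. Rows without a reverse counterpart are dropped, just as with the struct-based join.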

7

I'm not sure whether this is what you need:

public static void Main()
{
    List<RelationDTO> relationDTOList = new List<RelationDTO>
    {
        new RelationDTO { PersonId = 1, RelativeId = 2, Relation = "Son" },
        new RelationDTO { PersonId = 2, RelativeId = 1, Relation = "Father" },

        new RelationDTO { PersonId = 1, RelativeId = 3, Relation = "Mother" },
        new RelationDTO { PersonId = 3, RelativeId = 1, Relation = "Son" },

        new RelationDTO { PersonId = 2, RelativeId = 3, Relation = "Husband" },
        new RelationDTO { PersonId = 3, RelativeId = 2, Relation = "Wife" },
    };

    // Self-join: match each row with the row that has the IDs swapped.
    var grp = relationDTOList.Join(relationDTOList,
            dto => dto.PersonId + "-" + dto.RelativeId,
            dto => dto.RelativeId + "-" + dto.PersonId,
            (dto1, dto2) => new Relations
            {
                PersonId = dto1.PersonId,
                RelativeId = dto1.RelativeId,
                Relation = dto1.Relation,
                ReverseRelation = dto2.Relation
            }).Distinct(new MyEqualityComparer());

    foreach (var g in grp)
        Console.WriteLine("{0},{1},{2},{3}", g.PersonId, g.RelativeId, g.Relation, g.ReverseRelation);
}

public class MyEqualityComparer : IEqualityComparer<Relations>
{
    // Treats the pair 1-2 as equal to the pair 2-1.
    public bool Equals(Relations x, Relations y)
    {
        return x.PersonId + "-" + x.RelativeId == y.PersonId + "-" + y.RelativeId ||
        x.PersonId + "-" + x.RelativeId == y.RelativeId + "-" + y.PersonId;
    }

    // A constant hash forces Distinct to call Equals for every pair of items.
    public int GetHashCode(Relations obj)
    {
        return 0;
    }
}

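1
Can you explain the overridden Equals method? - Kunal Mukherjee
2
It is there to make the result list distinct. Because 1-2 should be equal to 2-1, we have to do the comparison ourselves. - ojlovecd
1
If you assume the original list is unique, you can get a distinct list by filtering on PersonId < RelativeId. - Taemyr
3
Please don't stringify two integers in order to compare them; it is usually better to do two comparisons. - Taemyr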
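
Following the last comment, a comparer could compare the integer IDs directly and return a hash that does not depend on the order of the pair, so that Distinct does not have to fall back on Equals for every combination. This is a minimal sketch, not the answerer's code; the class name UnorderedPairComparer is an assumption introduced here.

```csharp
using System;
using System.Collections.Generic;

public class Relations
{
    public int PersonId { get; set; }
    public int RelativeId { get; set; }
    public string Relation { get; set; }
    public string ReverseRelation { get; set; }
}

// Hypothetical replacement for MyEqualityComparer: treats (1, 2) and (2, 1) as equal.
public sealed class UnorderedPairComparer : IEqualityComparer<Relations>
{
    public bool Equals(Relations x, Relations y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;

        // Two integer comparisons per orientation, no string building.
        return (x.PersonId == y.PersonId && x.RelativeId == y.RelativeId)
            || (x.PersonId == y.RelativeId && x.RelativeId == y.PersonId);
    }

    public int GetHashCode(Relations obj)
    {
        // Order the two IDs first so that both orientations hash identically.
        int lo = Math.Min(obj.PersonId, obj.RelativeId);
        int hi = Math.Max(obj.PersonId, obj.RelativeId);
        return unchecked((lo * 397) ^ hi);
    }
}
```

It can be passed to Distinct in the same place as MyEqualityComparer.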

6

I doubt that LINQ is the best choice here, since a loop with lookups would probably be more efficient. But if you really need LINQ, you could do it like this:

var relations = from person in relationDTOList
    // Match on the exact pair of IDs
    join relative in relationDTOList on
        new { person.PersonId, person.RelativeId } equals
        new { PersonId = relative.RelativeId, RelativeId = relative.PersonId }

    // Build the new structure
    let relation = new Relations {
        PersonId = person.PersonId,
        Relation = person.Relation,
        RelativeId = relative.PersonId,
        ReverseRelation = relative.Relation
    }

    // Order the pairs to find the duplicates
    let ids = new[] {person.PersonId, relative.PersonId}.OrderBy(x => x).ToArray()
    group relation by new { FirstPersonId = ids[0], SecondPersonId = ids[1] }
    into relationGroups

    // Select only the first of the two duplicates
    select relationGroups.First();

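This code joins the collection to itself on matching PersonId/RelativeId pairs and then filters out the second record of each pair. The result is a collection in which the person found first in the list is treated as the parent side of the relation.

Edit: the lookup method I mentioned: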

var result = new List<Relations>();
while (relationDTOList.Any())
{
    var person = relationDTOList.First();
    relationDTOList.RemoveAt(0);

    // Attach the index before filtering so that Index refers to a position in relationDTOList
    var relative = relationDTOList
        .Select((x, i) => new {Person = x, Index = i})
        .FirstOrDefault(x =>
            x.Person.PersonId == person.RelativeId && x.Person.RelativeId == person.PersonId);

    if (relative != null)
    {
        relationDTOList.RemoveAt(relative.Index);
        result.Add(new Relations {
            PersonId = person.PersonId,
            Relation = person.Relation,
            RelativeId = relative.Person.PersonId,
            ReverseRelation = relative.Person.Relation
        });
    }
}

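As a note, it empties your original list, so you have to make a copy (list.ToList()) if you need the list later in your code.

Running this code turned out to be about six times faster than the join version I posted earlier. I also came up with the following grouping approach, which runs much faster than the join but is still slower than the lookup-and-remove method, even though the two do very similar things.
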
var relations = relationDTOList.GroupBy(person =>
        person.PersonId < person.RelativeId
            ? new {FirstPersonId = person.PersonId, SecondPersonId = person.RelativeId}
            : new {FirstPersonId = person.RelativeId, SecondPersonId = person.PersonId})

    .Select(group => new Relations {
        PersonId = group.First().PersonId,
        Relation = group.First().Relation,
        RelativeId = group.First().RelativeId,
        ReverseRelation = group.Last().Relation
    });

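Can you also post the loop-with-lookup approach you mentioned? - Kunal Mukherjee
1
@KunalMukherjee I added the code, along with some insights I got from running the different versions. - Imantas
The GroupBy version is a nice functional approach that does not need to empty the list. - Kunal Mukherjee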
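
For comparison, the lookup idea could also be written so that it leaves the input list untouched, by keeping unmatched rows in a dictionary keyed by the ID pair. This is a sketch under my own assumptions (C# 7.0 or later for the tuple key; the names DictionaryPairing and Pair are hypothetical), and I have not benchmarked it against the versions above.

```csharp
using System.Collections.Generic;

public class RelationDTO
{
    public int PersonId { get; set; }
    public int RelativeId { get; set; }
    public string Relation { get; set; }
}

public class Relations
{
    public int PersonId { get; set; }
    public int RelativeId { get; set; }
    public string Relation { get; set; }
    public string ReverseRelation { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class DictionaryPairing
{
    public static List<Relations> Pair(IEnumerable<RelationDTO> rows)
    {
        // Rows seen so far that are still waiting for their reverse row,
        // keyed by (PersonId, RelativeId).
        var pending = new Dictionary<(int, int), RelationDTO>();
        var result = new List<Relations>();

        foreach (var row in rows)
        {
            var reverseKey = (row.RelativeId, row.PersonId);
            if (pending.TryGetValue(reverseKey, out var first))
            {
                // The earlier row becomes the relation, the current row the reverse.
                pending.Remove(reverseKey);
                result.Add(new Relations
                {
                    PersonId = first.PersonId,
                    RelativeId = first.RelativeId,
                    Relation = first.Relation,
                    ReverseRelation = row.Relation
                });
            }
            else
            {
                pending[(row.PersonId, row.RelativeId)] = row;
            }
        }

        // Rows that never found a reverse row are dropped, as in the loop above.
        return result;
    }
}
```

Like the lookup-and-remove loop, the row that appears first in the list decides which side becomes Relation and which becomes ReverseRelation.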

4
var query = relationDTOList.OrderBy(x=>x.PersonId).GroupJoin(relationDTOList,
p => p.PersonId,
a => a.RelativeId,
(p, al) =>
new
{
     p.PersonId,
     p.RelativeId,
     p.Relation,
     // ?. avoids a NullReferenceException when no reverse row exists
     Parent = al.SingleOrDefault(x => x.PersonId == p.RelativeId && x.RelativeId == p.PersonId)?.Relation
 }
 ).ToList();

3
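You can group the relations by a sorted Tuple of PersonId and RelativeId, then take the first item of each group as the relation and the second item as the reverse relation.

Demo: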

using System;
using System.Collections.Generic;
using System.Linq;

namespace Example {

    public static class Program {

        public static void Main (string[] args) {

            List<RelationDTO> relationDTOList = new List<RelationDTO> {
                new RelationDTO { PersonId = 1, RelativeId = 2, Relation = "Son" },
                new RelationDTO { PersonId = 2, RelativeId = 1, Relation = "Father" },

                new RelationDTO { PersonId = 1, RelativeId = 3, Relation = "Mother" },
                new RelationDTO { PersonId = 3, RelativeId = 1, Relation = "Son" },

                new RelationDTO { PersonId = 2, RelativeId = 3, Relation = "Husband" },
                new RelationDTO { PersonId = 3, RelativeId = 2, Relation = "Wife" },
            };

            // Group relations into list of lists
            var groups = relationDTOList
                .GroupBy (r => GetOrderedTuple (r.PersonId, r.RelativeId))
                .Select (grp => grp.ToList ()).ToList ();

            // Output original relations and their reverse relations
            foreach (var group in groups) {
                var relation = group.ElementAt (0);
                var reverseRelation = group.ElementAt (1);
                FormattableString relationOutput = $"PersonId={relation.PersonId} RelativeId={relation.RelativeId} Relation={relation.Relation} ReverseRelation={reverseRelation.Relation}";
                Console.WriteLine (relationOutput);
            }
        }

        private static Tuple<int, int> GetOrderedTuple (int n1, int n2) {
            if (n1 < n2) {
                return Tuple.Create (n1, n2);
            }
            return Tuple.Create (n2, n1);
        }
    }
}

Output:

PersonId=1 RelativeId=2 Relation=Son ReverseRelation=Father
PersonId=1 RelativeId=3 Relation=Mother ReverseRelation=Son
PersonId=2 RelativeId=3 Relation=Husband ReverseRelation=Wife

1
This achieves the goal, but it requires the original list to contain the duplicate (reverse) rows.
var result = relationDTOList
                .Where(v => v.PersonId < v.RelativeId)
                .GroupJoin(relationDTOList,
                           p => p.PersonId,
                           a => a.RelativeId,
                           (p, al) =>
                                new{
                                    p.PersonId,
                                    p.RelativeId,
                                    p.Relation,
                                    ReverseRelation = al.Where( x => 
                                              x.PersonId == p.RelativeId &&
                                              x.RelativeId == p.PersonId )
                                                .SingleOrDefault()
                                                .Relation} ).ToList();
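
If the reverse rows cannot be guaranteed, one option is to look the reverse row up through ToLookup and use the null-conditional operator, so that a missing reverse row produces a null ReverseRelation instead of a NullReferenceException. This is a sketch under assumptions of my own, not the answerer's code; the names TolerantPairing and Pair are hypothetical, and it assumes C# 7.0 or later.

```csharp
using System.Collections.Generic;
using System.Linq;

public class RelationDTO
{
    public int PersonId { get; set; }
    public int RelativeId { get; set; }
    public string Relation { get; set; }
}

public class Relations
{
    public int PersonId { get; set; }
    public int RelativeId { get; set; }
    public string Relation { get; set; }
    public string ReverseRelation { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class TolerantPairing
{
    public static List<Relations> Pair(IEnumerable<RelationDTO> rows)
    {
        var list = rows.ToList();

        // Index every row by its (PersonId, RelativeId) pair.
        // A lookup returns an empty sequence for a key it does not contain.
        var byPair = list.ToLookup(r => (r.PersonId, r.RelativeId));

        return list
            .Where(r => r.PersonId < r.RelativeId)
            .Select(r => new Relations
            {
                PersonId = r.PersonId,
                RelativeId = r.RelativeId,
                Relation = r.Relation,
                // Null when the reverse row is missing.
                ReverseRelation = byPair[(r.RelativeId, r.PersonId)].FirstOrDefault()?.Relation
            })
            .ToList();
    }
}
```

Note that a pair for which only the row with PersonId > RelativeId exists is still dropped by the Where filter, so this only covers a missing reverse row, not a missing forward row.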
