使用LINQ透视数据

27

我有一个包含枚举(TypeCode)和用户对象的项目集合,我需要对其进行展开以在网格中显示。很难解释,让我快速展示一个示例。

集合包含以下项目:

TypeCode | User 
---------------
1        | Don Smith  
1        | Mike Jones  
1        | James Ray  
2        | Tom Rizzo  
2        | Alex Homes  
3        | Andy Bates  
我需要输出为:

1          | 2          | 3  
Don Smith  | Tom Rizzo  | Andy Bates  
Mike Jones | Alex Homes |  
James Ray  |            |  

我尝试使用foreach来做这件事,但是因为我会在foreach中向集合插入新的项而导致错误,所以我无法这样做。

是否可以使用Linq以更简洁的方式完成?


1
源是什么类型的集合?即使您的foreach代码是错误的,您也可以发布它吗?这将让我们更好地了解问题。 - Matthew Flaschen
我不确定这是否会影响答案,但您需要什么类型的网格?是带有HTML表格的ASP.NET,带有GridView组件的WinForms,WPF,某些报告引擎...? - CoderDennis
5个回答

27

我不是说这是一种伟大的转型方式 - 但它确实是一种转型...

    // sample data
    var data = new[] {
        new { Foo = 1, Bar = "Don Smith"},
        new { Foo = 1, Bar = "Mike Jones"},
        new { Foo = 1, Bar = "James Ray"},
        new { Foo = 2, Bar = "Tom Rizzo"},
        new { Foo = 2, Bar = "Alex Homes"},
        new { Foo = 3, Bar = "Andy Bates"},
    };
    // group into columns, and select the rows per column
    var grps = from d in data
              group d by d.Foo
              into grp
              select new {
                  Foo = grp.Key,
                  Bars = grp.Select(d2 => d2.Bar).ToArray()
              };

    // find the total number of (data) rows
    int rows = grps.Max(grp => grp.Bars.Length);

    // output columns
    foreach (var grp in grps) {
        Console.Write(grp.Foo + "\t");
    }
    Console.WriteLine();
    // output data
    for (int i = 0; i < rows; i++) {
        foreach (var grp in grps) {
            Console.Write((i < grp.Bars.Length ? grp.Bars[i] : null) + "\t");
        }
        Console.WriteLine();
    }

速度确实会成为大型数据集上的问题,但正如所述,“这是一个关键点”。 - DJ van Wyk

13

Marc的回答给出了稀疏矩阵,无法直接输入到Grid中。 我尝试扩展来自Vasu提供的链接中的代码如下:

public static Dictionary<TKey1, Dictionary<TKey2, TValue>> Pivot3<TSource, TKey1, TKey2, TValue>(
    this IEnumerable<TSource> source
    , Func<TSource, TKey1> key1Selector
    , Func<TSource, TKey2> key2Selector
    , Func<IEnumerable<TSource>, TValue> aggregate)
{
    return source.GroupBy(key1Selector).Select(
        x => new
        {
            X = x.Key,
            Y = source.GroupBy(key2Selector).Select(
                z => new
                {
                    Z = z.Key,
                    V = aggregate(from item in source
                                  where key1Selector(item).Equals(x.Key)
                                  && key2Selector(item).Equals(z.Key)
                                  select item
                    )

                }
            ).ToDictionary(e => e.Z, o => o.V)
        }
    ).ToDictionary(e => e.X, o => o.Y);
} 
internal class Employee
{
    public string Name { get; set; }
    public string Department { get; set; }
    public string Function { get; set; }
    public decimal Salary { get; set; }
}
public void TestLinqExtenions()
{
    var l = new List<Employee>() {
    new Employee() { Name = "Fons", Department = "R&D", Function = "Trainer", Salary = 2000 },
    new Employee() { Name = "Jim", Department = "R&D", Function = "Trainer", Salary = 3000 },
    new Employee() { Name = "Ellen", Department = "Dev", Function = "Developer", Salary = 4000 },
    new Employee() { Name = "Mike", Department = "Dev", Function = "Consultant", Salary = 5000 },
    new Employee() { Name = "Jack", Department = "R&D", Function = "Developer", Salary = 6000 },
    new Employee() { Name = "Demy", Department = "Dev", Function = "Consultant", Salary = 2000 }};

    var result5 = l.Pivot3(emp => emp.Department, emp2 => emp2.Function, lst => lst.Sum(emp => emp.Salary));
    var result6 = l.Pivot3(emp => emp.Function, emp2 => emp2.Department, lst => lst.Count());
}

* 不过我不能说任何关于性能的事情。


我想进行一些数据透视操作,但你在这里提供的答案导致了另一个问题,而这个问题的答案则链接已失效。真是令人失望。 - Robert Synoradzki
@ensisNoctis 该页面已被缓存至此处:https://web.archive.org/web/20120901120648/http://www.extensionmethod.net/Details.aspx?ID=147 - user4864425
从他自己的博客:https://www.reflectionit.nl/Blog/2009/c-linq-pivot-function - user4864425

9
你可以使用 Linq 的 .ToLookup 方法来按你所需的方式进行分组。
var lookup = data.ToLookup(d => d.TypeCode, d => d.User);

接下来的任务就是将它处理成消费者能够理解的形式。例如:

//Warning: untested code
var enumerators = lookup.Select(g => g.GetEnumerator()).ToList();
int columns = enumerators.Count;
while(columns > 0)
{
  for(int i = 0; i < enumerators.Count; ++i)
  {
    var enumerator = enumerators[i];
    if(enumator == null) continue;
    if(!enumerator.MoveNext())
    { 
      --columns;
      enumerators[i] = null;
    }
  }
  yield return enumerators.Select(e => (e != null) ? e.Current : null);
}

把这个放在一个IEnumerable<>方法中,它将(可能)返回一个用户集合(行)的集合(列),其中插入一个空值的列没有数据。

2

我猜这和Marc的答案类似,但我会发一下因为我花了一些时间在上面。结果用" | "分隔,就像你的例子一样。它还使用了从LINQ查询返回的IGrouping<int, string>类型,当使用group by时,而不是构造一个新的匿名类型。这是经过测试的可行代码。

var Items = new[] {
    new { TypeCode = 1, UserName = "Don Smith"},
    new { TypeCode = 1, UserName = "Mike Jones"},
    new { TypeCode = 1, UserName = "James Ray"},
    new { TypeCode = 2, UserName = "Tom Rizzo"},
    new { TypeCode = 2, UserName = "Alex Homes"},
    new { TypeCode = 3, UserName = "Andy Bates"}
};
var Columns = from i in Items
              group i.UserName by i.TypeCode;
Dictionary<int, List<string>> Rows = new Dictionary<int, List<string>>();
int RowCount = Columns.Max(g => g.Count());
for (int i = 0; i <= RowCount; i++) // Row 0 is the header row.
{
    Rows.Add(i, new List<string>());
}
int RowIndex;
foreach (IGrouping<int, string> c in Columns)
{
    Rows[0].Add(c.Key.ToString());
    RowIndex = 1;
    foreach (string user in c)
    {
        Rows[RowIndex].Add(user);
        RowIndex++;
    }
    for (int r = RowIndex; r <= Columns.Count(); r++)
    {
        Rows[r].Add(string.Empty);
    }
}
foreach (List<string> row in Rows.Values)
{
    Console.WriteLine(row.Aggregate((current, next) => current + " | " + next));
}
Console.ReadLine();

我还使用以下输入进行了测试:

var Items = new[] {
    new { TypeCode = 1, UserName = "Don Smith"},
    new { TypeCode = 3, UserName = "Mike Jones"},
    new { TypeCode = 3, UserName = "James Ray"},
    new { TypeCode = 2, UserName = "Tom Rizzo"},
    new { TypeCode = 2, UserName = "Alex Homes"},
    new { TypeCode = 3, UserName = "Andy Bates"}
};

以下是翻译的结果:

以下结果显示第一列无需包含最长的列表。如有需要,您可以使用OrderBy按TypeCode对列进行排序。

1         | 3          | 2
Don Smith | Mike Jones | Tom Rizzo
          | James Ray  | Alex Homes
          | Andy Bates | 

1

@Sanjaya.Tio 我被你的回答吸引了,创建了这个适配器以最小化keySelector的执行。(未经测试)

public static Dictionary<TKey1, Dictionary<TKey2, TValue>> Pivot3<TSource, TKey1, TKey2, TValue>(
    this IEnumerable<TSource> source
    , Func<TSource, TKey1> key1Selector
    , Func<TSource, TKey2> key2Selector
    , Func<IEnumerable<TSource>, TValue> aggregate)
{
  var lookup = source.ToLookup(x => new {Key1 = key1Selector(x), Key2 = key2Selector(x)});

  List<TKey1> key1s = lookup.Select(g => g.Key.Key1).Distinct().ToList();
  List<TKey2> key2s = lookup.Select(g => g.Key.Key2).Distinct().ToList();

  var resultQuery =
    from key1 in key1s
    from key2 in key2s
    let lookupKey = new {Key1 = key1, Key2 = key2}
    let g = lookup[lookupKey]
    let resultValue = g.Any() ? aggregate(g) : default(TValue)
    select new {Key1 = key1, Key2 = key2, ResultValue = resultValue};

  Dictionary<TKey1, Dictionary<TKey2, TValue>> result = new Dictionary<TKey1, Dictionary<TKey2, TValue>>();
  foreach(var resultItem in resultQuery)
  {
    TKey1 key1 = resultItem.Key1;
    TKey2 key2 = resultItem.Key2;
    TValue resultValue = resultItem.ResultValue;

    if (!result.ContainsKey(key1))
    {
      result[key1] = new Dictionary<TKey2, TValue>();
    }
    var subDictionary = result[key1];
    subDictionary[key2] = resultValue; 
  }
  return result;
}

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接