Powerquery:展开所有具有记录的列

4
使用Microsoft Excel 2013中的Power Query,我创建了一个如下所示的表格:
// To insert this in Power Query, append a '=' before the 'Table.FromRows'
Table.FromRows(
  {
    {"0", "Tom", "null", "null"},
    {"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
    {"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
  },
  {"ID", "Name", "Address", "Wife"}
)

现在,我想通过在两个记录上使用name属性来扩展AddressWife列。手动操作的话,我会像这样做:
// To insert this in Power Query, append a '=' before the 'Table.FromRows'
let
  t = Table.FromRows(
    {
      {"0", "Tom", "null", "null"},
      {"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
      {"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
    },
    {"ID", "Name", "Address", "Wife"}
  ),
  expAddress = Table.ExpandRecordColumn(t, "Address", {"name"}, {"Address → name"}),
  expWife = Table.ExpandRecordColumn(expAddress, "Wife", {"name"}, {"Wife → name"})
in
  expWife

背景

每当我有数据表格的布局不同,我就需要重新编写查询。在一个幻想世界中,您可以使用特定的键扩展所有具有记录的列。 理想情况下,您将拥有以下库函数:

// Returns a list with the names of the columns that match the secified type.
// Will also try to infer the type of a column if the table is untyped.
Table.ColumnsOfTypeInfer(
  table as table,
  listOfTypes as list
) as list

// Expands a column of records into columns with each of the values.
Table.ExpandRecordColumnByKey(
  table as table,
  columns as list,
  key as text,
) as table

然后,我可以调用
// To insert this in Power Query, append a '=' before the 'Table.FromRows'
let
  t = Table.FromRows(
    {
      {"0", "Tom", "null", "null"},
      {"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
      {"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
    },
    {"ID", "Name", "Address", "Wife"}
  ),
  recordColumns = Table.ColumnsOfTypeInfer(t, {type record}),
  expAll = Table.ExpandRecordColumnByKey(t, recordColumns, "name")
in
  expAll

问题

  1. 您能否获取具有特定类型的列列表,该类型未在表中指定,也就是推断出它?
  2. 您能使记录扩展通用吗?

编辑:添加了带有两个空单元格的行#0。

2个回答

3

首先感谢清晰的解释、示例数据和建议!

1) M 代码中无法进行类型推断。这个限制几乎可以被视为一种“特性”,因为如果源数据在某种方式下发生变化,导致推断类型不同,那么它几乎肯定会破坏您的查询。

加载未经过类型处理的数据后,使用检测数据类型按钮快速生成M,或者如果您正在读取来自JSON的数据,则应该已经足够有序了。

如果您有一个特定的场景无法正常工作,请更新您的问题:)

2) 只要表格的单元格值是记录类型,那么使记录扩展成通用类型就很容易,只需要稍微费点事情。此操作将查找所有行都是null或记录的列,并扩展name列。

下面是一些简单的实现,您可以添加到您的库中:

let
  t = Table.FromRows(
    {
      {"0", "Tom", null, null},
      {"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
      {"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
    },
    {"ID", "Name", "Address", "Wife"}
  ),

  Table.ColumnsOfAllRowType = (table as table, typ as type) as list => let
    ColumnNames = Table.ColumnNames(table),
    ColumnsOfType = List.Select(ColumnNames, (name) => 
      List.AllTrue(List.Transform(Table.Column(table, name), (cell) => Type.Is(Value.Type(cell), typ))))
  in
    ColumnsOfType,

  Table.ExpandRecordColumnByKey = (table as table, columns as list, key as text) as table  => 
    List.Accumulate(columns, table, (state, columnToExpand) => 
      Table.ExpandRecordColumn(state, columnToExpand, {key}, { columnToExpand & " → " & key })),

  recordColumns = Table.ColumnsOfAllRowType(t, type nullable record),
  expAll = Table.ExpandRecordColumnByKey(t, recordColumns, "name")
in
  expAll

如果一个新的库函数可以在 M 中实现,我们就不太可能将其添加到我们的标准库中,但如果您觉得它确实缺失了,请随时在以下网址提出建议:https://ideas.powerbi.com/forums/265200-power-bi/

您可能有理由添加类似 Table.ReplaceTypeFromFirstRow(table as table) as table 的函数,因为使用 M 构造类型非常混乱。


谢谢您的回答。然而,我们不能假设第一行有记录,并且可能存在空值。我在这方面更新了我的例子。@Carl Walsh - Jonas Gröger

2
抱歉回复有点晚,但我刚遇到了类似的挑战。我尝试使用Chris Webb的ExpandAll函数:http://blog.crossjoin.co.uk/2014/05/21/expanding-all-columns-in-a-table-in-power-query/,但它只适用于表类型列,而不是记录类型列,但我已经成功地将其修改为此目的。我将Chris的函数复制了一份并命名为“ExpandAllRecords”,并进行了3个编辑:

  1. each if _ is table then Table.ColumnNames(_)替换为each if _ is record then Record.FieldNames(_)
  2. Table.ExpandTableColumn替换为Table.ExpandRecordColumn
  3. ExpandAll替换为ExpandAllRecords

我尝试编写一个函数来同时扩展表格和记录,但我一直收到类型错误。

无论如何,有了这个,最终查询就是:

let
    t = Table.FromRows(
        {
            {"1", "Tom", null, [ name="Jane", age=35 ]},
            {"2", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
            {"3", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
        },
        {"ID", "Name", "Address", "Wife"}
    ),
    Output = ExpandAllRecords(t)
in
    Output

编辑:

考虑到有一天 Chris Webb(由 @MikeHoney 提到)的伟大代码片段可能会消失,我在此镜像整个代码:

let
     //Define function taking two parameters - a table and an optional column number 
     Source = (TableToExpand as table, optional ColumnNumber as number) =>
     let
      //If the column number is missing, make it 0
      ActualColumnNumber = if (ColumnNumber=null) then 0 else ColumnNumber,

      //Find the column name relating to the column number
      ColumnName = Table.ColumnNames(TableToExpand){ActualColumnNumber},

      //Get a list containing all of the values in the column
      ColumnContents = Table.Column(TableToExpand, ColumnName),

      //Iterate over each value in the column and then
      //If the value is of type table get a list of all of the columns in the table
      //Then get a distinct list of all of these column names
      ColumnsToExpand = List.Distinct(List.Combine(List.Transform(ColumnContents, 
                         each if _ is table then Table.ColumnNames(_) else {}))),

      //Append the original column name to the front of each of these column names
      NewColumnNames = List.Transform(ColumnsToExpand, each ColumnName & "." & _),

      //Is there anything to expand in this column?
      CanExpandCurrentColumn = List.Count(ColumnsToExpand)>0,

      //If this column can be expanded, then expand it
      ExpandedTable = if CanExpandCurrentColumn 
                          then 
                          Table.ExpandTableColumn(TableToExpand, ColumnName, 
                                 ColumnsToExpand, NewColumnNames) 
                          else 
                          TableToExpand,
      //If the column has been expanded then keep the column number the same, otherwise add one to it
      NextColumnNumber = if CanExpandCurrentColumn then ActualColumnNumber else ActualColumnNumber+1,

      //If the column number is now greater than the number of columns in the table
      //Then return the table as it is
      //Else call the ExpandAll function recursively with the expanded table
      OutputTable = if NextColumnNumber>(Table.ColumnCount(ExpandedTable)-1) 
                         then 
                         ExpandedTable 
                         else 
                         ExpandAll(ExpandedTable, NextColumnNumber)
     in
     OutputTable
in
     Source

You can then use this function on the XML file as follows:

let
     //Load XML file
     Source = Xml.Tables(File.Contents("C:\Users\Chris\Documents\PQ XML Expand All Demo.xml")),
     ChangedType = Table.TransformColumnTypes(Source,{{"companyname", type text}}),

     //Call the ExpandAll function to expand all columns
     Output = ExpandAll(ChangedType)
in
     Output  

(Source and downloadable example: Chris Webb's Bi Blog, 2014-05-21)


Chris Webb的函数扩展表格类型列的Gist:https://gist.github.com/Mike-Honey/f5f3f708310eb1de7f4a - Mike Honey
我对扩展记录类型列的调整的要点:https://gist.github.com/Mike-Honey/0a252edf66c3c486b69b - Mike Honey

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接