在Excel中如何使用VBA进行融合/重塑?

19

我目前在适应一份新工作,大部分与同事共享的工作都是通过MS Excel进行。我经常使用数据透视表,因此需要"堆叠"数据,这正是我依赖于R中reshape2包中的melt()函数产生的输出。

有人能帮我开始编写一个VBA宏来完成这个任务吗?或者已经存在这样的宏了吗?

该宏的概述如下:

  1. 选择Excel工作簿中的一系列单元格。
  2. 启动“melt”宏。
  3. 宏将创建提示,“输入id列数”,您需要输入识别信息前面的列数(对于下面的示例R代码,为4)。
  4. 创建一个名为“melt”的新工作表,并将数据堆叠起来,并创建一个名为“variable”的新列,其值等于原始选择的数据列标题。

换句话说,输出结果将与在R中执行以下两行代码的输出完全相同:

require(reshape)
melt(your.unstacked.dataframe, id.vars = 1:4)

以下是一个例子:

# unstacked data
> df1
  Year Month Country  Sport No_wins No_losses High_score Total_games
2 2010     5     USA Soccer       4         3          5           9
3 2010     6     USA Soccer       5         3          4           8
4 2010     5     CAN Soccer       2         9          7          11
5 2010     6     CAN Soccer       4         8          4          13
6 2009     5     USA Soccer       8         1          4           9
7 2009     6     USA Soccer       0         0          3           2
8 2009     5     CAN Soccer       2         0          6           3
9 2009     6     CAN Soccer       3         0          8           3

# stacking the data
> require(reshape)
> melt(df1, id.vars=1:4)

  Year Month Country  Sport    variable value
1  2010     5     USA Soccer     No_wins     4
2  2010     6     USA Soccer     No_wins     5
3  2010     5     CAN Soccer     No_wins     2
4  2010     6     CAN Soccer     No_wins     4
5  2009     5     USA Soccer     No_wins     8
6  2009     6     USA Soccer     No_wins     0
7  2009     5     CAN Soccer     No_wins     2
8  2009     6     CAN Soccer     No_wins     3
9  2010     5     USA Soccer   No_losses     3
10 2010     6     USA Soccer   No_losses     3
11 2010     5     CAN Soccer   No_losses     9
12 2010     6     CAN Soccer   No_losses     8
13 2009     5     USA Soccer   No_losses     1
14 2009     6     USA Soccer   No_losses     0
15 2009     5     CAN Soccer   No_losses     0
16 2009     6     CAN Soccer   No_losses     0
17 2010     5     USA Soccer  High_score     5
18 2010     6     USA Soccer  High_score     4
19 2010     5     CAN Soccer  High_score     7
20 2010     6     CAN Soccer  High_score     4
21 2009     5     USA Soccer  High_score     4
22 2009     6     USA Soccer  High_score     3
23 2009     5     CAN Soccer  High_score     6
24 2009     6     CAN Soccer  High_score     8
25 2010     5     USA Soccer Total_games     9
26 2010     6     USA Soccer Total_games     8
27 2010     5     CAN Soccer Total_games    11
28 2010     6     CAN Soccer Total_games    13
29 2009     5     USA Soccer Total_games     9
30 2009     6     USA Soccer Total_games     2
31 2009     5     CAN Soccer Total_games     3
32 2009     6     CAN Soccer Total_games     3

4
我偏爱的完成此任务的方法是:1. 将 Excel 文件保存为 CSV 格式;2. 读取至 R 中并正常处理;3. 将 CSV 文件进行融合 / 重塑并重新导出;4. 在 Excel 中打开,就好像什么都没有发生过一样。 - Justin
1
有一个插件(RExcel)可以让你从Excel中调用R。 - Brian Diggs
是的,那正是我一直在做的(将单元格复制到剪贴板,在R中处理,输出为.csv)。然而,我想要制作一个可以与同事共享的解决方案,所以必须使用VBA。 - baha-kev
4个回答

21

我在我的博客上有两篇文章,介绍如何在Excel/VBA中完成此操作,并提供可用代码和可下载的工作簿:

http://yoursumbuddy.com/data-normalizer

http://yoursumbuddy.com/data-normalizer-the-sql/

这是代码:

'Arguments
'List: The range to be normalized.
'RepeatingColsCount: The number of columns, starting with the leftmost,
'   whose headings remain the same.
'NormalizedColHeader: The column header for the rolled-up category.
'DataColHeader: The column header for the normalized data.
'NewWorkbook: Put the sheet with the data in a new workbook?
'
'NOTE: The data must be in a contiguous range and the
'columns that will be repeated must be to the left,
'with the columns to be normalized to the right.

Sub NormalizeList(List As Excel.Range, RepeatingColsCount As Long, _
    NormalizedColHeader As String, DataColHeader As String, _
    Optional NewWorkbook As Boolean = False)

Dim FirstNormalizingCol As Long, NormalizingColsCount As Long
Dim ColsToRepeat As Excel.Range, ColsToNormalize As Excel.Range
Dim NormalizedRowsCount As Long
Dim RepeatingList() As String
Dim NormalizedList() As Variant
Dim ListIndex As Long, i As Long, j As Long
Dim wbSource As Excel.Workbook, wbTarget As Excel.Workbook
Dim wsTarget As Excel.Worksheet

With List
    'If the normalized list won't fit, you must quit.
   If .Rows.Count * (.Columns.Count - RepeatingColsCount) > .Parent.Rows.Count Then
        MsgBox "The normalized list will be too many rows.", _
               vbExclamation + vbOKOnly, "Sorry"
        Exit Sub
    End If

    'You have the range to be normalized and the count of leftmost rows to be repeated.
   'This section uses those arguments to set the two ranges to parse
   'and the two corresponding arrays to fill
   FirstNormalizingCol = RepeatingColsCount + 1
    NormalizingColsCount = .Columns.Count - RepeatingColsCount
    Set ColsToRepeat = .Cells(1).Resize(.Rows.Count, RepeatingColsCount)
    Set ColsToNormalize = .Cells(1, FirstNormalizingCol).Resize(.Rows.Count, NormalizingColsCount)
    NormalizedRowsCount = ColsToNormalize.Columns.Count * .Rows.Count
    ReDim RepeatingList(1 To NormalizedRowsCount, 1 To RepeatingColsCount)
    ReDim NormalizedList(1 To NormalizedRowsCount, 1 To 2)
End With

'Fill in every i elements of the repeating array with the repeating row labels.
For i = 1 To NormalizedRowsCount Step NormalizingColsCount
    ListIndex = ListIndex + 1
    For j = 1 To RepeatingColsCount
        RepeatingList(i, j) = List.Cells(ListIndex, j).Value2
    Next j
Next i

'We stepped over most rows above, so fill in other repeating array elements.
For i = 1 To NormalizedRowsCount
    For j = 1 To RepeatingColsCount
        If RepeatingList(i, j) = "" Then
            RepeatingList(i, j) = RepeatingList(i - 1, j)
        End If
    Next j
Next i

'Fill in each element of the first dimension of the normalizing array
'with the former column header (which is now another row label) and the data.
With ColsToNormalize
    For i = 1 To .Rows.Count
        For j = 1 To .Columns.Count
            NormalizedList(((i - 1) * NormalizingColsCount) + j, 1) = .Cells(1, j)
            NormalizedList(((i - 1) * NormalizingColsCount) + j, 2) = .Cells(i, j)
        Next j
    Next i
End With

'Put the normal data in the same workbook, or a new one.
If NewWorkbook Then
    Set wbTarget = Workbooks.Add
    Set wsTarget = wbTarget.Worksheets(1)
Else
    Set wbSource = List.Parent.Parent
    With wbSource.Worksheets
        Set wsTarget = .Add(after:=.Item(.Count))
    End With
End If

With wsTarget
    'Put the data from the two arrays in the new worksheet.
   .Range("A1").Resize(NormalizedRowsCount, RepeatingColsCount) = RepeatingList
    .Cells(1, FirstNormalizingCol).Resize(NormalizedRowsCount, 2) = NormalizedList

    'At this point there will be repeated header rows, so delete all but one.
   .Range("1:" & NormalizingColsCount - 1).EntireRow.Delete

    'Add the headers for the new label column and the data column.
   .Cells(1, FirstNormalizingCol).Value = NormalizedColHeader
    .Cells(1, FirstNormalizingCol + 1).Value = DataColHeader
End With
End Sub

您会这样调用它:
Sub TestIt()
NormalizeList ActiveSheet.UsedRange, 4, "Variable", "Value", False
End Sub

这很棒;然而我在选择有效范围时遇到了麻烦。 当我突出显示我要堆叠的单元格时,它会拉入额外的空白列(可能来自您的棒球示例)。 在运行宏之前如何界定相关范围? - baha-kev
1
你需要更改调用行为:NormalizeList Selection, 4, "Variable", "Value", False 或者 NormalizeList ActiveSheet.Range("A1:D100"), 4, "Variable", "Value", False 或者任何其他应该使用的范围。 - Doug Glancy

8

最近微软发布了Power Query,这是一个Excel插件,它为Excel中的数据操作添加了许多有趣的功能和能力,包括你正在寻找的内容。

在这个插件中实际的功能叫做“Unpivot Columns”,在这篇文章中有详细解释。以下是简要概述:

  1. 下载并安装插件
  2. 打开你的Excel/CSV文件
  3. 选择你想要融合/重塑的表格/范围
  4. 在“Power Query”选项卡中,点击“从表格中获取数据”,这将打开“查询编辑器”
  5. 选择你想要融合/重塑的列(ctrl或shift选择,不要拖动)
  6. 在“转换”选项卡中,点击“Unpivot Columns”(你也可以在这里应用其他转换,然后返回Excel)
  7. 在“主页”选项卡中,点击“关闭并加载”。这将在Excel中创建一个新的表格/查询对象,其中包含所需的结果。

1
或者使用:
Sub M_snb_000()
  With sheet1.Cells(1).CurrentRegion
    sn = .Resize(, .Columns.Count + 1)
  End With

  For j = 4 To UBound(sn, 2) - 1
    With Sheet2.Cells(2 + (UBound(sn) - 1) * (j - 4), 1)
       .Resize(UBound(sn) - 1, 5) = Application.Index(sn, Evaluate("row(2:" 
             & UBound(sn) & ")"), Array(1, 2, 3,UBound(sn, 2), j))
       .Resize(UBound(sn) - 1, 1).Offset(, 3) = sn(1, j)
    End With
  Next
End Sub

这似乎不是正确的答案,因为它不接受用于id列数的变量。 - KHeaney
1
@KHeaney 实际上,这个答案达到了Excel高手的水平。你试过了吗?对于这个特定的问题,只需要进行一个小改变...将 For j = 4... 这一行改为 For j = 5... 就可以完美地解决问题了。这是对Excel-VBA细微差别的精湛掌握。@snb,做得非常好! - Excel Hero
@ExcelHero 我没有测试过它,只是在Triage队列中快速评估了一下。老实说,它可能会很好地工作,因为它似乎会正确地迭代所需的范围。但正如我在评论中所说,我不认为它接受输入,也不知道用户的用例,所以我按照他们的问题陈述去做,即它需要用户的输入。 - KHeaney

0

首先创建一个用户窗体并将其命名为Unpivot_Form,其中包含两个RefEdit字段 - rng_id和value_id以及一个提交/执行按钮。我也是一个R用户,rng_id是包含id的范围,而value_id包含值; 两个范围都包括标题。

编写两个宏:

Sub unpivot()
Unpivot_Form.Show
End Sub

另一个宏位于字段的提交/前往按钮内:

Private Sub submit_Click()
'Code to unpivot (convert wide to long for excel)

Dim rng_id, rng_id_header, val_id As Range
Dim colvar, emptyrow, col As Integer
Dim new_sheet As Worksheet

'Put val_id range into a range object
Set val_id = Range(value_id.Value)

'Determine the parameter for the value id range
'This is used for the looping later on
numrows = val_id.Rows.Count
numcols = val_id.Columns.Count

'Resize changes the "block" to the size defined by the row and column
'Offset moves the "block"
Set rng_id_header = Range(range_id.Value).Resize(1)
Set rng_id = Range(range_id.Value).Offset(1, 0).Resize(numrows - 1)

Set new_sheet = Worksheets.Add

'Set up the first column and first batch of id vars
new_sheet.Activate
Range("A65535").End(xlUp).Activate
rng_id_header.Copy ActiveCell
colvar = Range("XFD1").End(xlToLeft).Column + 1
Range("XFD1").End(xlToLeft).Offset(, 1).Value = "Variable"
Range("XFD1").End(xlToLeft).Offset(, 1).Value = "Value"

'Start populating the value ids
For col = 1 To numcols

  'populate var_id
  'determine last row
   emptyrow = Range("A65535").End(xlUp).Row + 1
   'no need to activate to source to copy
   rng_id.Copy new_sheet.Cells(emptyrow, 1)
  'copy the variable
  val_id.Offset(, col - 1).Resize(1, 1).Copy new_sheet.Range(Cells(emptyrow, colvar), Cells(emptyrow + numrows - 2, colvar))
  'copy the value
  val_id.Offset(1, col - 1).Resize(numrows - 1, 1).Copy new_sheet.Range(Cells(emptyrow, colvar + 1), Cells(emptyrow + numrows - 2, colvar + 1))

Next

Unload Me

End Sub

享受吧!


网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接