Reshaping a table in R with dplyr

A quick tip for doing this in R would be welcome. We have the following data:
   City            Amount    Category
1  Los Angeles     100       Film
2  Los Angeles     200       Film
3  Los Angeles     400       Music 
4  Seattle         300       Coffee
5  Boston          600       Books
...

The final result should look like this:

                        Film   Coffee   Books   ...
City  
Los Angeles, CA         Sum    Sum      Sum     Sum 
Seattle, WA             Sum    Sum      Sum     Sum 
Boston, MA              Sum    Sum      Sum     Sum  

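I want a pivot table that sums "Amount" for every category in every city, with the cities listed down the left as a column and all the categories across the top as a row.
What I tried: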
data %>%                                            
  group_by(City, Category) %>%
  summarise(Amount = sum(Amount))

which gives something more like this:

   City            Amount    Category
1  Los Angeles     300       Film
3  Los Angeles     400       Music 
4  Seattle         300       Coffee
5  Boston          600       Books

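The sums are correct, but as described above we need cities and categories laid out as a matrix, with each summed amount in the corresponding cell.

Thanks for your help!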

1 Answer

What you need is tidyr::spread, which converts your data frame from long format to wide format:
library(tidyverse)

# recreate the data
data <- tribble(
  ~City,             ~Amount,   ~Category,
  "Los Angeles",     100,       "Film",
  "Los Angeles",     200,       "Film",
  "Los Angeles",     400,       "Music", 
  "Seattle",         300,       "Coffee",
  "Boston",          600,       "Books"
)

# using your code to get the data in the long-format
data_long <- data %>% 
  group_by(City, Category) %>%
  summarise(Amount = sum(Amount))

data_long
#> # A tibble: 4 x 3
#> # Groups:   City [?]
#>          City Category Amount
#>         <chr>    <chr>  <dbl>
#> 1      Boston    Books    600
#> 2 Los Angeles     Film    300
#> 3 Los Angeles    Music    400
#> 4     Seattle   Coffee    300

# spread to wide using the tidyr-package (in tidyverse)
data_wide <- spread(data_long, key = "Category", value = "Amount", fill = 0)

data_wide
#> # A tibble: 3 x 5
#> # Groups:   City [3]
#>          City Books Coffee  Film Music
#> *       <chr> <dbl>  <dbl> <dbl> <dbl>
#> 1      Boston   600      0     0     0
#> 2 Los Angeles     0      0   300   400
#> 3     Seattle     0    300     0     0
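
In newer tidyr releases spread() is superseded by pivot_wider(). As a minimal sketch, assuming tidyr 1.1.0 or later (where values_fn and values_fill accept a single function and a single value), the summing and the reshaping can be done in one call on the same example data, without the group_by/summarise step:

```r
library(tidyr)

# same example data as above
data <- tibble::tribble(
  ~City,         ~Amount, ~Category,
  "Los Angeles", 100,     "Film",
  "Los Angeles", 200,     "Film",
  "Los Angeles", 400,     "Music",
  "Seattle",     300,     "Coffee",
  "Boston",      600,     "Books"
)

# values_fn = sum adds up rows that share a City/Category pair;
# values_fill = 0 fills the pairs that never occur in the data
data_wide <- pivot_wider(
  data,
  names_from  = Category,
  values_from = Amount,
  values_fn   = sum,
  values_fill = 0
)

data_wide
```

The row and column order may differ from the spread() output above, so it is safer to look values up by name than by position.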

Going to a matrix

mat <- as.matrix(data_wide %>% ungroup %>% select(-City))
rownames(mat) <- data_wide$City

mat
#>             Books Coffee Film Music
#> Boston        600      0    0     0
#> Los Angeles     0      0  300   400
#> Seattle         0    300    0     0

str(mat)
#>  num [1:3, 1:4] 600 0 0 0 0 300 0 300 0 0 ...
#>  - attr(*, "dimnames")=List of 2
#>   ..$ : chr [1:3] "Boston" "Los Angeles" "Seattle"
#>   ..$ : chr [1:4] "Books" "Coffee" "Film" "Music"
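
If a matrix is the end goal, base R can also build it straight from the raw rows. The following is only a sketch of an alternative, not part of the approach above: it uses xtabs() on a plain data.frame copy of the example data, which sums Amount for each City/Category pair and puts 0 where a pair does not occur.

```r
# example data as a base data.frame (same values as in the question)
data <- data.frame(
  City     = c("Los Angeles", "Los Angeles", "Los Angeles", "Seattle", "Boston"),
  Amount   = c(100, 200, 400, 300, 600),
  Category = c("Film", "Film", "Music", "Coffee", "Books")
)

# sum Amount for every City x Category combination
tab <- xtabs(Amount ~ City + Category, data = data)

# drop the "xtabs"/"table" classes and the stored call
# to leave an ordinary numeric matrix with dimnames
mat <- unclass(tab)
attr(mat, "call") <- NULL

mat["Los Angeles", "Film"]
```

The result can be indexed by city and category name in the same way as the matrix built from data_wide.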

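Is there also a way to convert the output into a matrix holding the numeric sums? That would be the final challenge. - Christopher
If you don't need the column names, you can skip this step. - David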
