Generating a frequency table by selecting specific rows

Question

6

I have a minimal example of a dataset D that looks like this:

 score person freq
    10      1    3
    10      2    5
    10      3    4
     8      1    3
     7      2    2
     6      4    1

Now I would like to plot the frequency of score 10 by person, but if I do this:

#My bad, turns out the next line only works for matrices anyway:
#D = D[which(D[,1] == 10)]

D = subset(D, score == 10)

then I get:

score person freq
   10      1    3
   10      2    5
   10      3    4

However, this is what I would like to get:

score person freq
   10      1    3
   10      2    5
   10      3    4
   10      4    0

Is there a quick and painless way to do this in R?

- SamTheTomato

3 Answers

4

You can use the complete() function from the tidyr package to create the missing rows, and then simply subset:

library(tidyr)
D2 <- complete(D, score, person, fill = list(freq = 0))
D2[D2$score == 10, ]
## Source: local data frame [4 x 3]
## 
##   score person  freq
##   (int)  (int) (dbl)
## 1    10      1     3
## 2    10      2     5
## 3    10      3     4
## 4    10      4     0

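The first argument of complete() is the data frame to work on. It is followed by the names of the columns whose combinations should be completed. The fill argument is a list that gives, for each remaining column (here only freq), the value to put in the newly created rows.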
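As a reproducible sketch of the explanation above: the data frame below is rebuilt by hand from the table in the question (the column types are an assumption; freq is stored as a double so that it matches the fill value 0). It shows that complete() expands D to every score/person combination, 4 scores times 4 persons, before the subset keeps only score 10.

```r
library(tidyr)

# Example data retyped from the question's table (types are assumed).
D <- data.frame(
  score  = c(10, 10, 10, 8, 7, 6),
  person = c(1, 2, 3, 1, 2, 4),
  freq   = c(3, 5, 4, 3, 2, 1)
)

# Expand to all score/person combinations; rows that did not exist
# in D get freq = 0 through the fill argument.
D2 <- complete(D, score, person, fill = list(freq = 0))
nrow(D2)  # 4 distinct scores x 4 distinct persons

# Keep only the rows for score 10, now including person 4.
D10 <- D2[D2$score == 10, ]
D10
```

Note that the intermediate D2 also contains zero-filled rows for scores 8, 7 and 6, which is why the subset step is still needed afterwards.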

As docendo-discimus suggested, this can be simplified further with the dplyr package:

library(tidyr)
library(dplyr)
complete(D, score, person, fill = list(freq = 0)) %>% filter(score == 10)

- Stibu

1

Or use a dplyr pipe: complete(df, score, person, fill = list(freq = 0)) %>% filter(score == 10) - talat

0

Here is an approach using dplyr:

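library(dplyr)
D %>%
  # zero out freq for scores other than 10, then label every row as score 10
  mutate(freq = ifelse(score == 10, freq, 0),
         score = 10) %>%
  group_by(score, person) %>%
  # one row per person: the score-10 freq if there is one, otherwise 0
  summarise(freq = max(freq))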

Source: local data frame [4 x 3]
Groups: score [?]

  score person  freq
  (dbl)  (int) (dbl)
1    10      1     3
2    10      2     5
3    10      3     4
4    10      4     0

- SabDeM

- talat · Accepted Answer

Here is a base R approach:

subset(as.data.frame(xtabs(freq ~ score + person, D)), score == 10)
#   score person Freq
#4     10      1    3
#8     10      2    5
#12    10      3    4
#16    10      4    0
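One detail that may matter for plotting: as.data.frame() on an xtabs result returns score and person as factors and names the count column Freq (capitalised). The sketch below, which assumes the same hand-typed copy of the question's data, converts the result back to the original column names and numeric types. The helper variable names tab and res are illustrative only.

```r
# Example data retyped from the question's table (types are assumed).
D <- data.frame(
  score  = c(10, 10, 10, 8, 7, 6),
  person = c(1, 2, 3, 1, 2, 4),
  freq   = c(3, 5, 4, 3, 2, 1)
)

# Cross-tabulate freq over every score/person pair; absent pairs become 0.
tab <- as.data.frame(xtabs(freq ~ score + person, D))
res <- subset(tab, score == 10)

# score and person are factors here; go through character to recover numbers.
res$score  <- as.numeric(as.character(res$score))
res$person <- as.numeric(as.character(res$person))

# Restore the original column name and drop the leftover row names.
names(res)[names(res) == "Freq"] <- "freq"
rownames(res) <- NULL
res
```

Converting through as.character() is needed because as.numeric() applied directly to a factor returns the internal level codes rather than the labels.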