Summarize counts based on a condition

Question


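I have a data frame containing quadrat counts for multiple species across multiple years, but some entries are marked "p" for "present". I want to treat those entries as NA when calculating the mean, while also keeping track of the number of p's for each species/year. So my question is: is there a way to use summarize(count) to count the occurrences of "p"?

Minimal example: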

library(dplyr)

df <- data.frame(
  # years
  year = rep(1990:1992, each = 3),
  # character vector of counts and p's
  count = c("p", "p", "2", "1", "5", "4", "7", "p", "4")
) %>%
  # numeric column of counts; "p" entries become NA (with a coercion warning)
  mutate(count_numeric = as.numeric(count))


# summarize dataset
df %>%
  group_by(year) %>%
  summarize(number_quadrats = n(), # find total number of rows
            average_count = mean(count_numeric, na.rm=T)) # find average value

But I want to add one more line to the summary that simply counts the number of p's in each group. Something like this:

df %>%
  group_by(year) %>%
  summarize(number_quadrats = n(), # find total number of rows
            average_count = mean(count_numeric, na.rm=T),# find average value
            number_p = n(count == "p"))

But that doesn't work.

Any suggestions are welcome.

Thanks!

- Jake L

2 Answers

Just change the last line:

df %>%
  group_by(year) %>%
  summarize(number_quadrats = n(), # find total number of rows
            average_count = mean(count_numeric, na.rm=T),# find average value
            number_p = sum(count == "p"))

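By summing a logical (Boolean) vector, you are effectively counting the number of times the condition is met, because TRUE is coerced to 1 and FALSE to 0.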
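
To see the coercion outside of `summarize()`, here is a minimal base R sketch. The vector reuses the counts from the question; the names `is_p`, `n_p`, `prop_p`, `with_na`, and `n_p_safe` are made up for illustration, and the NA case is an assumption about data the question does not show.

```r
# Same character values as the `count` column in the question
count <- c("p", "p", "2", "1", "5", "4", "7", "p", "4")

# The comparison yields a logical vector: TRUE where the entry is "p"
is_p <- count == "p"

# sum() coerces TRUE to 1 and FALSE to 0, so this counts the matches
n_p <- sum(is_p)

# mean() of the same logical vector gives the proportion of matches
prop_p <- mean(is_p)

# Assumption: if the column could contain real NAs, the comparison
# returns NA at those positions; na.rm = TRUE keeps sum() from returning NA
with_na <- c("p", NA, "3")
n_p_safe <- sum(with_na == "p", na.rm = TRUE)
```

The same expression works unchanged inside `summarize()` after `group_by()`, where it is evaluated once per group.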

- ErrorJordan

- TarJae · Accepted Answer

Something like this!

df %>%
  group_by(year) %>%
  summarize(N = n(), number_quadrats = sum(count == 'p'),
            average_count = mean(count_numeric, na.rm=T))

  year     N number_quadrats average_count
  <int> <int>           <int>         <dbl>
1  1990     3               2          2   
2  1991     3               0          3.33
3  1992     3               1          5.5
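
For a version that runs end to end, the following is a sketch, assuming dplyr is attached. It swaps in `na_if()` to turn "p" into NA before the numeric conversion, which is not what the question does but avoids the coercion warning from `as.numeric(count)`. The column names follow the question rather than the output above, and `result` is a name chosen for illustration.

```r
library(dplyr)

# Data from the question: three quadrats per year, "p" means "present"
df <- data.frame(
  year = rep(1990:1992, each = 3),
  count = c("p", "p", "2", "1", "5", "4", "7", "p", "4")
)

result <- df %>%
  # Replace "p" with NA first, so as.numeric() has nothing to warn about
  mutate(count_numeric = as.numeric(na_if(count, "p"))) %>%
  group_by(year) %>%
  summarize(
    number_quadrats = n(),                             # rows per year
    number_p = sum(count == "p"),                      # "p" entries per year
    average_count = mean(count_numeric, na.rm = TRUE)  # mean of numeric counts
  )

print(result)
```

If the real data also has a species column, adding it to `group_by()` should give one row per species/year combination.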