Summarize data by two groups

Question

I would like to compute descriptive statistics for my data, grouped by two variables, "SampledSub" and "SampledLUL", using this subset of the data:

myData <- structure(list(SampledLUL = structure(c(12L, 12L, 9L, 9L, 9L, 
9L), .Label = c("citrus", "crop", "cypress swamp", "freshwater marsh and wet prairie", 
"hardwood swamp", "improved pasture", "mesic upland forest", "mixed wetland forest", 
"pineland", "rangeland", "shrub swamp", "urban", "xeric upland forest"), class = "factor"), 
SampledSub = structure(c(12L, 12L, 4L, 12L, 8L, 4L), .Label = c("Aqualf", "Aquent", 
"Aquept", "Aquod", "Aquoll", "Aquult", "Arent", "Orthod", "Psamment", "Saprist", "Udalf", 
"Udult"), class = "factor"), SOC = c(3.381524292, 6.345916406, 2.122765119, 2.188488973, 
6.980834272, 7.363643479)), 
.Names = c("SampledLUL", "SampledSub", "SOC"), row.names = c(NA, 6L), class = "data.frame")

I summarized it by the two groups with this code:

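library(plyr)

# `buffer` is not defined in this post; it is assumed to be the full data frame.
# Substitute myData to run this on the subset above.
group.test <- ddply(buffer, c("SampledSub", "SampledLUL"), summarise,
                   N    = length(SOC),
                   mean = mean(SOC),
                   sd   = sd(SOC),
                   se   = sd / sqrt(N) )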

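But the output table has both the groups and the summary statistics as columns. How can I produce a table like the one shown below? In my case, "SampledSub" would be the observations (one row per value), and the summary statistics would be grouped by "SampledLUL".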

- derelict

You could look at ?tabular from library(tables). - akrun

@akrun, from what I have tried, that package does not work on R 3.2.0+. - Pierre L

@PierreLafortune I haven't tried it yet. - akrun

Great job with the dput! - jeremycg

1 Answer

- jeremycg · Accepted Answer

You can do this with tidyr (although the output table won't be as pretty as the one above):

library(tidyr)
group.test %>% gather(variable, val, - SampledSub, -SampledLUL) %>%
               unite(newcol, SampledLUL, variable) %>%
               spread(newcol, val)

  SampledSub pineland_mean pineland_N pineland_sd pineland_se urban_mean urban_N urban_sd urban_se
1      Aquod      4.743204          2    3.705861    2.620439         NA      NA       NA       NA
2     Orthod      6.980834          1         NaN         NaN         NA      NA       NA       NA
3      Udult      2.188489          1         NaN         NaN    4.86372       2 2.096142 1.482196
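
In current tidyr releases, gather() and spread() are superseded by pivot_longer() and pivot_wider(), so the same reshaping can probably be done in one step. The following is a minimal sketch, not a tested drop-in for the full data set: it assumes dplyr 1.0.0 or later and a tidyr version that supports names_glue, it rebuilds the six rows of the question's dput() by hand as plain character columns, and the name `wide` is only illustrative.

```r
library(dplyr)
library(tidyr)

# The same six rows as the question's dput(), retyped as character columns
# (assumption: the unused factor levels are not needed here).
myData <- data.frame(
  SampledLUL = c("urban", "urban", "pineland", "pineland", "pineland", "pineland"),
  SampledSub = c("Udult", "Udult", "Aquod", "Udult", "Orthod", "Aquod"),
  SOC        = c(3.381524292, 6.345916406, 2.122765119,
                 2.188488973, 6.980834272, 7.363643479)
)

# One row per SampledSub x SampledLUL combination, as in the ddply() call.
group.test <- myData %>%
  group_by(SampledSub, SampledLUL) %>%
  summarise(N    = n(),
            mean = mean(SOC),
            sd   = sd(SOC),
            se   = sd / sqrt(N),
            .groups = "drop")

# Spread the four statistics across columns, one set per SampledLUL level.
# names_glue builds names such as "pineland_mean" and "urban_se".
wide <- group.test %>%
  pivot_wider(names_from  = SampledLUL,
              values_from = c(N, mean, sd, se),
              names_glue  = "{SampledLUL}_{.value}")

print(wide)
```

Combinations that do not occur in the data (for example Orthod under urban) come out as NA, as in the table above. The column order may differ from the gather()/spread() result.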