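I have a data frame that looks like this (see link). I want to process the output generated below further by spreading the tone variable across the n and average variables. This thread seems related, but I could not get it to work: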
Is it possible to use spread on multiple columns in tidyr similar to dcast?
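I want the final table to have the source variable in one column, followed by columns for the tone-n and tone-avg variables. In other words, the column headers should be "source", "For - n", "Against - n", "For - Avg", "Against - Avg". This is for publication, not for further computation, so it is about how the data are presented. I think presenting the data this way is more intuitive. Thanks.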
#variable1
Politician.For<-sample(seq(0,4,1),50, replace=TRUE)
#variable2
Politician.Against<-sample(seq(0,4,1),50, replace=TRUE)
#variable3
Activist.For<-sample(seq(0,4,1),50,replace=TRUE)
#variable4
Activist.Against<-sample(seq(0,4,1),50,replace=TRUE)
#dataframe
df<-data.frame(Politician.For, Politician.Against, Activist.For,Activist.Against)
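#load dplyr (for %>%, group_by, summarise) and tidyr (for gather, separate, spread)
library(dplyr)
library(tidyr)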
df %>%
#Gather all columns
gather(df) %>%
#separate by the period character
#(the default separator is any non-alphanumeric character)
separate(col=df, into=c('source', 'tone')) %>%
#group by both source and tone
group_by(source,tone) %>%
#summarise to create counts and average
summarise(n=sum(value), avg=mean(value)) %>%
#try to spread (this is the step I cannot get to work)
spread(tone, c('n', 'value'))
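For reference, here is a minimal sketch of one way the layout described above might be reached with the same tidyr verbs. It is not necessarily the best approach: it assumes the summary statistics are first stacked into a single column with gather(), then combined with tone via unite(), so that spread() only has to handle one key column. The names `variable`, `stat`, `val`, `key` and `wide`, and the fixed seed, are placeholders introduced for this illustration only.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the illustration is reproducible

# Same shape as the data frame in the question
df <- data.frame(
  Politician.For     = sample(seq(0, 4, 1), 50, replace = TRUE),
  Politician.Against = sample(seq(0, 4, 1), 50, replace = TRUE),
  Activist.For       = sample(seq(0, 4, 1), 50, replace = TRUE),
  Activist.Against   = sample(seq(0, 4, 1), 50, replace = TRUE)
)

wide <- df %>%
  # stack all columns into name/value pairs
  gather(variable, value) %>%
  # split "Politician.For" into source = "Politician", tone = "For"
  separate(variable, into = c("source", "tone")) %>%
  group_by(source, tone) %>%
  # n is the sum of the values, as in the question
  summarise(n = sum(value), avg = mean(value)) %>%
  ungroup() %>%
  # stack the two statistics so there is a single value column
  gather(stat, val, n, avg) %>%
  # build one key such as "For - n" from tone and stat
  unite(key, tone, stat, sep = " - ") %>%
  # one column per tone/statistic combination
  spread(key, val) %>%
  # spread() orders new columns alphabetically, so reorder for presentation
  select(source, `For - n`, `Against - n`, `For - avg`, `Against - avg`)

print(wide)
```

The result should have one row per source and one column per tone/statistic pair. Because spread() accepts only a single value column, the extra gather() and unite() steps are what make a two-statistic layout possible here.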