Difference between top_n and order in R

Question

5

I don't quite understand the output of dplyr's top_n function. Can anyone help?

library(dplyr)
n <- 10
df <- data.frame(ref = sample(letters, n), score = rnorm(n))
print(dplyr::top_n(df, 5, score))
print(df[order(df$score, decreasing = TRUE)[1:5], ])

The output of top_n is not sorted by score as I expected. Compare it with the output obtained with order:

 ref      score
1    i 0.71556494
2    p 0.04463846
3    v 0.37290990
4    g 1.53206194
5    f 0.86307107
   ref      score
7    g 1.53206194
10   f 0.86307107
1    i 0.71556494
6    v 0.37290990
4    p 0.04463846

The documentation I read also implies that the top_n result should be sorted by the specified column, for example: https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf

- PM.

2

The results are actually the same, but top_n simply keeps the 5 rows in their original order. Try: df %>% top_n(5) %>% arrange(desc(score)) - agenis

Indeed. top_n is equivalent to filter(x, min_rank(desc(wt)) <= n); it does not sort the rows, and the documentation does not suggest that it does. - Axeman

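I read the help file, and what it describes is the ordering used to make the selection (that is, min_rank applied to wt), not a rearrangement of the rows. I agree that the cheatsheet is wrong on this point. - Axeman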
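The equivalence mentioned in the comments can be checked with a small sketch. The seed and the names via_top_n and via_filter are assumptions added here for illustration; they are not part of the original question.

```r
library(dplyr)

set.seed(42)  # assumed seed, only so the sketch is repeatable
n <- 10
df <- data.frame(ref = sample(letters, n), score = rnorm(n))

# top_n() keeps the rows with the 5 highest scores,
# but leaves them in their original row order
via_top_n <- top_n(df, 5, score)

# The filter() form quoted in the comments
via_filter <- filter(df, min_rank(desc(score)) <= 5)

print(via_top_n)

# Both approaches should select the same rows in the same (original) order
print(identical(via_top_n$ref, via_filter$ref))
```

Neither call sorts anything: min_rank is only used to decide which rows pass the filter.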

2 Answers

0

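My misunderstanding and my expectation came from reading the documentation linked in the question and discussed in the comments. Despite what some documentation claims, top_n does not produce output sorted by wt.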

- PM.

- Megatron · Accepted Answer

Both outputs contain the same rows, but top_n does not rearrange the rows.

You can use arrange() to get the same result as df[order(df$score, decreasing = TRUE)[1:5], ]:

top_n(df, 5, score) %>% arrange(desc(score))

Reversing the order, df[order(df$score, decreasing = FALSE)[1:5], ] is equivalent to top_n(df, -5, score) %>% arrange(score).
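As a rough check of both equivalences, here is a minimal sketch. The seed and the names top_base, top_dplyr, bottom_base and bottom_dplyr are assumptions added for illustration. It compares columns rather than whole data frames, because base subsetting keeps the original row names while the dplyr result is renumbered, as the printed output in the question shows.

```r
library(dplyr)

set.seed(1)  # assumed seed, only so the sketch is repeatable
df <- data.frame(ref = sample(letters, 10), score = rnorm(10))

# Base R: sort the row indices by score, then take the first five
top_base    <- df[order(df$score, decreasing = TRUE)[1:5], ]
bottom_base <- df[order(df$score, decreasing = FALSE)[1:5], ]

# dplyr: select with top_n(), then sort explicitly with arrange()
top_dplyr    <- top_n(df, 5, score) %>% arrange(desc(score))
bottom_dplyr <- top_n(df, -5, score) %>% arrange(score)

# Compare the columns; row names are expected to differ
print(identical(top_base$score, top_dplyr$score))
print(identical(bottom_base$score, bottom_dplyr$score))
```

With continuous scores such as those from rnorm, ties are not a practical concern here. With tied values, top_n may return more than n rows, so the two approaches could then differ.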