What does the order_by argument of dplyr::slice_max() do?

Question

r dplyr

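The documentation for slice_min() and slice_max() says that the order_by argument can be a variable or a function of variables to order by.

What is a function of variables, and how would it be applied in practice? For example, can it be used to supply a custom ordering for categorical values?

I have searched online for relevant information without success, so I am turning to you. Thanks.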

- pyg

1 Answer

- Gregor Thomas · Accepted Answer

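A "function of variables" means any function that takes columns of the data frame as input and returns a numeric result (or at least something that has a "max" value). Here are some examples: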

library(dplyr)

## get the row with the highest product of Sepal.Length and Sepal.Width
iris %>% slice_max(Sepal.Length * Sepal.Width)
## here we use the function `*` and the variables `Sepal.Length` and `Sepal.Width`

## get the rows with the longest species name
iris %>% slice_max(nchar(as.character(Species)))
## here we use the function `nchar` and the variable `Species`
## (`Species` is a factor, and `nchar` errors on factors, so convert it to character first)
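
The same idea extends to functions you write yourself. The following is a minimal sketch, assuming dplyr 1.0.0 or later (where slice_max() is available); `sepal_area` is a hypothetical helper made up for illustration, not part of dplyr.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): any function that maps columns
# to an orderable vector can be used inside order_by.
sepal_area <- function(length, width) length * width

# Keep the 3 rows with the largest sepal area.
# Ties at the cut-off are kept by default, so more than 3 rows are possible.
top_area <- iris %>%
  slice_max(sepal_area(Sepal.Length, Sepal.Width), n = 3)

top_area
```

Because the expression is evaluated against the data frame's columns, the helper receives whole column vectors and only needs to return one value per row.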

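"For example, can it be used to supply a custom ordering for categorical values?"

Typically, when you want a custom ordering for a categorical variable, you use a factor and specify the order of its levels. And yes, you can use that in slice_max: the last factor level is treated as the max: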

iris %>% slice_max(Species)
## `Species` is a factor whose default levels are in alphabetical order - all virginica rows returned

iris %>% slice_max(factor(Species, levels = c("versicolor", "virginica", "setosa")))
## if we make "setosa" the last/max level then setosa rows will be returned
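
As a further sketch of custom ordering, assuming dplyr 1.0.0 or later: the `priority` vector below is a hypothetical name introduced for illustration. It shows that `match()` can serve as the "function of variables" instead of `factor()`, and that `with_ties = FALSE` limits the result to a single row when many rows share the max level.

```r
library(dplyr)

# Hypothetical priority vector: entries listed later rank higher.
priority <- c("versicolor", "virginica", "setosa")

# match() returns each species' position in `priority` (1, 2 or 3),
# so the species listed last gets the largest value.
by_match <- iris %>% slice_max(match(Species, priority))

# Same ordering through a factor, but keep one row instead of every tied row.
one_row <- iris %>%
  slice_max(factor(Species, levels = priority), n = 1, with_ties = FALSE)
```

Both calls pick from the setosa rows; the first returns all of them because ties are kept by default, while the second returns just one.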