Recursive function calls using purrr::map

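I have a list column in a data frame, and I want to use purrr::map() to test whether any of its elements are NULL and drop the corresponding rows.
I can do this with sapply, but I can't get map to work. I have read https://cran.r-project.org/web/packages/purrr/purrr.pdf, but I can't figure out what I'm missing.
Here is my sapply code, which works fine: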
P_Trans<- P_Trans[!sapply(P_Trans$Group,is.null),] 

Here are the four purrr::map approaches I tried; none of them works:

a)

P_Trans %>% purrr::map(.,~is.null(Group))

b)

P_Trans %>% purrr::map(.,~is.null(.$Group))

c)

P_Trans %>% purrr::map(~is.null(.$Group))

d)

P_Trans %>% purrr::map(~is.null(Group))

Could someone correct my mistakes and tell me what I am doing wrong in each of the four options above?


Data:

dput(P_Trans)

structure(list(TransactionID = c("a1", "a1", "a1", "a2", "a2", 
"a2", "a3", "a3", "a3", "a3", "a4", "a5", "a5", "a5", "a5", "a5", 
"a6", "a6", "a7"), ProductID = c("A", "B", "1", "C", "4", "5", 
"D", "C", "7", "8", "H", "1", "2", "3", "3", "1", "H", "15", 
"22"), ProductType = c(1, 1, 2, 1, 2, 2, 1, 1, 2, 2, 1, 2, 2, 
2, 2, 2, 1, 2, 3), Group = list(structure(list(Group = "Group1"), .Names = "Group", row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group1"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group1"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = c("Group2", "Group3")), .Names = "Group", row.names = c(NA, 
-2L), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group2"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group2"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group3"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = c("Group2", "Group3")), .Names = "Group", row.names = c(NA, 
-2L), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group3"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group3"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group5"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group1"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group1"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group1"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group1"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group1"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group5"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    Group = "Group5"), .Names = "Group", row.names = c(NA, -1L
), class = c("tbl_df", "tbl", "data.frame")), NULL)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -19L), .Names = c("TransactionID", 
"ProductID", "ProductType", "Group"))

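P_Trans[!map_lgl(P_Trans$Group,is.null),]? - Nate
@NathanDay - Wow! Thank you, this works well. Could you also comment on my mistakes, though? I'm fairly new to dplyr, so that would help me learn the concepts. Thanks for your thoughts. - watchtower
I only use map when I have a list of data frames to iterate over, usually for modeling. By default, map tries to return a list; wrappers like map_lgl just coerce the result into a vector, similar to map(P_Trans$Group, is.null) %>% unlist, while still keeping names and so on :) Hope this helps. map still confuses me sometimes, and then I go back to xapply. - Nate
Why not just use purrr::discard()? - hrbrmstr

1 Answer

For all of the solutions you tried: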

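  • You are looping over each column of P_Trans, not over each item or row.
  • Three of those columns (TransactionID, ProductID, ProductType) are unnamed atomic vectors and the fourth (Group) is an unnamed list, so none of them has an element called Group: names(P_Trans[[1]]) # NULL
  • map returns a list, not a data frame, although these calls fail before they return anything.
  • a is equivalent to d
  • b is equivalent to c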
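
The first bullet can be checked on a small example. The sketch below is illustrative only: `df` is a hypothetical three-row stand-in for P_Trans, not the asker's data, and it assumes purrr is installed.

```r
library(purrr)

# Hypothetical toy data standing in for P_Trans (3 rows, one list column)
df <- data.frame(id = c("a1", "a2", "a3"), stringsAsFactors = FALSE)
df$Group <- list("Group1", NULL, "Group3")

# map() over a data frame visits columns, so there are 2 results, not 3
col_classes <- map_chr(df, ~ class(.x)[1])

# Inside the lambda, `.` is a whole column; `$` fails on the atomic `id` column
attempt_b <- tryCatch(
  map(df, ~ is.null(.$Group)),
  error = function(e) "error"
)
```

Here `col_classes` has one entry per column, and `attempt_b` ends up holding the fallback string because the call raises an error on the first column.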

a)P_Trans %>% purrr::map(.,~is.null(Group))

  • Group is not defined: nothing in this call tells R to look for it in the current item, let alone in the table.

b)P_Trans %>% purrr::map(.,~is.null(.$Group))

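  • You are looping over the four columns, each time asking for an element named Group. None of them has one (not even the fourth), and the first column is an atomic vector, so the call fails with: $ operator is invalid for atomic vectors

lmap could help you loop over the columns of P_Trans as sub-lists of length 1, but this approach would still fail, because only the last sub-list has an item named Group (names(P_Trans[4]) # "Group").
The map equivalent of your solution is P_Trans[!map_lgl(P_Trans$Group,is.null),], as @Nate mentioned, because map_lgl is designed to return a logical vector just like your sapply does.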
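
As a minimal sketch of that equivalence, assuming purrr is installed and using a hypothetical toy data frame `df` in place of P_Trans:

```r
library(purrr)

# Hypothetical toy data standing in for P_Trans
df <- data.frame(id = c("a1", "a2", "a3"), stringsAsFactors = FALSE)
df$Group <- list("Group1", NULL, "Group3")

# Iterate over the elements of the list column, not over the data frame
is_missing_map    <- map_lgl(df$Group, is.null)
is_missing_sapply <- sapply(df$Group, is.null)

# Both give the same logical vector, which can index the rows
kept <- df[!is_missing_map, ]

# Plain map() would return a list instead of a logical vector
as_list <- map(df$Group, is.null)
```

The key difference from the four attempts is the input: `df$Group` is passed to map_lgl directly, so each call to is.null sees one element of the list column.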
Other ways to get what you want:
P_Trans %>% rowwise %>% filter(!is.null(Group))
P_Trans %>% filter(lengths(Group)!=0)
P_Trans[lengths(P_Trans$Group)!=0,]
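
One caveat about the lengths-based variants: they test for zero length rather than for NULL, so they may also drop elements that are empty but not NULL. The sketch below uses a hypothetical list `groups` to show the difference, and also shows purrr::discard(), which was suggested in the comments and removes elements from a list rather than rows from a data frame.

```r
library(purrr)

# Hypothetical list with one NULL and one empty-but-not-NULL element
groups <- list("Group1", NULL, character(0))

null_flags  <- map_lgl(groups, is.null)  # TRUE only for the NULL element
empty_flags <- lengths(groups) == 0      # TRUE for NULL and for character(0)

# discard() drops the elements for which the predicate is TRUE
non_null <- discard(groups, is.null)
```

For the data in the question the two tests agree, since the only empty element there is the final NULL.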
