Concatenate/paste a column by group and add the result to the original data

3

I have a data frame of names ('Name') grouped by department ('Dept'):

 Dept Date      Name            
----- --------- --------------- 
   30 07-DEC-02 Raphaely        
   30 18-MAY-03 Khoo            
   40 07-JUN-02 Mavris          
   50 01-MAY-03 Kaufling        
   50 14-JUL-03 Ladwig          
   70 07-JUN-02 Baer            
   90 13-JAN-01 De Haan
   90 17-JUN-03 King  
  100 16-AUG-02 Faviet
  100 17-AUG-02 Greenberg 
  110 07-JUN-02 Gietz           
  110 07-JUN-02 Higgins         
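
For anyone who wants to reproduce this, the table above can be rebuilt roughly as follows. This is only a sketch: the object name mydf and the choice to store Date and Name as plain character strings are assumptions, not something the listing itself specifies.

```r
# Hypothetical reconstruction of the sample data shown above.
# Assumption: Date and Name are kept as character, not factor or Date.
mydf <- data.frame(
  Dept = c(30, 30, 40, 50, 50, 70, 90, 90, 100, 100, 110, 110),
  Date = c("07-DEC-02", "18-MAY-03", "07-JUN-02", "01-MAY-03", "14-JUL-03",
           "07-JUN-02", "13-JAN-01", "17-JUN-03", "16-AUG-02", "17-AUG-02",
           "07-JUN-02", "07-JUN-02"),
  Name = c("Raphaely", "Khoo", "Mavris", "Kaufling", "Ladwig", "Baer",
           "De Haan", "King", "Faviet", "Greenberg", "Gietz", "Higgins"),
  stringsAsFactors = FALSE
)

# Inspect the column types before grouping.
str(mydf)
```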

I want to concatenate the 'Name' column by 'Dept' and add the result to the original data. The expected result is the 'Emp_list' column:

 Dept Date      Name            Emp_list
----- --------- --------------- ---------------------------------------------
   30 07-DEC-02 Raphaely        Raphaely; Khoo
   30 18-MAY-03 Khoo            Raphaely; Khoo
   40 07-JUN-02 Mavris          Mavris
   50 01-MAY-03 Kaufling        Kaufling; Ladwig
   50 14-JUL-03 Ladwig          Kaufling; Ladwig
   70 07-JUN-02 Baer            Baer
   90 13-JAN-01 De Haan         De Haan; King
   90 17-JUN-03 King            De Haan; King
  100 16-AUG-02 Faviet          Faviet; Greenberg
  100 17-AUG-02 Greenberg       Faviet; Greenberg
  110 07-JUN-02 Gietz           Gietz; Higgins
  110 07-JUN-02 Higgins         Gietz; Higgins

Any suggestions?

Possible duplicate of "Find duplicated data from an index and string them together" - Thomas
Also, https://dev59.com/sGQn5IYBdhLWcg3wvpUV#16596601 - Thomas
1
@Thomas, in defense of the OP, those are aggregation-type questions (collapsing rows), whereas this one is not. - A5C1D2H2I1M1N2O1R2T1
2 Answers

8
You can use ave() together with paste():
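# ave() applies FUN within each Dept and returns a vector aligned with the rows;
# as.character() guards against Name being stored as a factor.
within(mydf, {
  Emp_list <- ave(as.character(Name), Dept, FUN = function(x) paste(x, collapse = "; "))
})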
#   Dept      Date      Name          Emp_list
# 1    30 07-DEC-02  Raphaely    Raphaely; Khoo
# 2    30 18-MAY-03      Khoo    Raphaely; Khoo
# 3    40 07-JUN-02    Mavris            Mavris
# 4    50 01-MAY-03  Kaufling  Kaufling; Ladwig
# 5    50 14-JUL-03    Ladwig  Kaufling; Ladwig
# 6    70 07-JUN-02      Baer              Baer
# 7    90 13-JAN-01   De Haan     De Haan; King
# 8    90 17-JUN-03      King     De Haan; King
# 9   100 16-AUG-02    Faviet Faviet; Greenberg
# 10  100 17-AUG-02 Greenberg Faviet; Greenberg
# 11  110 07-JUN-02     Gietz    Gietz; Higgins
# 12  110 07-JUN-02   Higgins    Gietz; Higgins
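
To see why this works: ave() splits its first argument by the grouping variable, applies FUN to each piece, and writes the result back into the original positions, so a single collapsed string is repeated for every row of its group. A toy sketch with made-up vectors (grp and val are invented for illustration and are not part of the question's data):

```r
# Toy vectors, invented for illustration only.
grp <- c("a", "a", "b")
val <- c("x", "y", "z")

# paste(..., collapse = ) returns one string per group;
# ave() then fills that string into every position of the group.
out <- ave(val, grp, FUN = function(x) paste(x, collapse = "; "))
print(out)
# [1] "x; y" "x; y" "z"
```

Because the result has the same length and order as the input, it can be assigned straight into a new column without any merge step.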

1

Or with plyr:

gr <- read.csv("gr.csv")
library(plyr)
# Build one collapsed string per Dept, then merge it back onto every row.
emp <- ddply(gr, .(Dept), summarise, Emp_List = paste0(Name, collapse = "; "))
merge(gr, emp, by = "Dept")

   Dept      Date      Name          Emp_List
1    30 07-DEC-02  Raphaely    Raphaely; Khoo
2    30 18-MAY-03      Khoo    Raphaely; Khoo
3    40 07-JUN-02    Mavris            Mavris
4    50 01-MAY-03  Kaufling  Kaufling; Ladwig
5    50 14-JUL-03    Ladwig  Kaufling; Ladwig
6    70 07-JUN-02      Baer              Baer
7    90 13-JAN-01   De Haan     De Haan; King
8    90 17-JUN-03      King     De Haan; King
9   100 16-AUG-02    Faviet Faviet; Greenberg
10  100 17-AUG-02 Greenberg Faviet; Greenberg
11  110 07-JUN-02     Gietz    Gietz; Higgins
12  110 07-JUN-02   Higgins    Gietz; Higgins
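
The plyr call above follows a summarise-then-merge pattern: collapse the names once per department, then join that one-row-per-group table back onto the original rows. If plyr is not available, the same pattern can probably be sketched with base R's aggregate(). This is a swapped-in technique rather than what the answer itself uses, and the small gr data frame below is a hypothetical stand-in holding a few rows of the sample data in place of gr.csv.

```r
# Hypothetical stand-in for gr.csv: a few rows of the question's sample data.
gr <- data.frame(
  Dept = c(30, 30, 40),
  Date = c("07-DEC-02", "18-MAY-03", "07-JUN-02"),
  Name = c("Raphaely", "Khoo", "Mavris"),
  stringsAsFactors = FALSE
)

# Step 1: one row per Dept holding the collapsed names.
emp <- aggregate(Name ~ Dept, data = gr, FUN = paste, collapse = "; ")
names(emp)[names(emp) == "Name"] <- "Emp_List"

# Step 2: join the per-group string back onto every original row.
res <- merge(gr, emp, by = "Dept")
print(res)
```

Note that merge() sorts the result by the key column by default, so the row order may differ from the input when the data is not already ordered by Dept.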
