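I have a data cleaning problem. Data were collected three times, and sometimes the data entry was incorrect. So, if a student's data were collected more than once, the value from the second data point needs to be copied to that student's other rows.
This is what my dataset looks like: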
df <- data.frame(id = c(1,1,1, 2,2,2, 3,3, 4,4, 5),
                 text = c("female","male","male", "female","female","female", "male","female","male", "female", "female"),
                 time = c("first","second","third", "first","second","third", "first","second","second", "third", "first"))
> df
id text time
1 1 female first
2 1 male second
3 1 male third
4 2 female first
5 2 female second
6 2 female third
7 3 male first
8 3 female second
9 4 male second
10 4 female third
11 5 female first
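So the gender information for IDs 1, 3 and 4 is incorrect. When an ID has multiple, differing entries for the "gender" variable (the text column), I need to copy the value from the "second" data point to all of that ID's rows. If an ID has only one data point, it should stay in the dataset unchanged.
The desired output is as follows: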
> df1
id text time
1 1 male first
2 1 male second
3 1 male third
4 2 female first
5 2 female second
6 2 female third
7 3 female first
8 3 female second
9 4 male second
10 4 male third
11 5 female first
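To make the rule concrete, here is a rough base R sketch of what I have in mind. It is only one possible approach, not necessarily the best one. It assumes that every id with more than one row has a row with time == "second" (true for the example data), and fix_text is a hypothetical helper name I made up for illustration.

```r
# Example data from above; stringsAsFactors = FALSE keeps text as character
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5),
  text = c("female", "male", "male", "female", "female", "female",
           "male", "female", "male", "female", "female"),
  time = c("first", "second", "third", "first", "second", "third",
           "first", "second", "second", "third", "first"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for the rows of one id, return the text recorded at
# time == "second" repeated for every row; if there is no "second" row,
# return the values unchanged.
fix_text <- function(text, time) {
  second <- text[time == "second"]
  if (length(second) >= 1) rep(second[1], length(text)) else text
}

df1 <- df
# Apply the helper to each id in turn; row order is left untouched.
for (i in unique(df1$id)) {
  rows <- df1$id == i
  df1$text[rows] <- fix_text(df1$text[rows], df1$time[rows])
}

print(df1)
```

An id with a single row (such as id 5) has no "second" row, so it passes through unchanged. I am not sure what should happen if an id has several rows but none at "second"; this sketch would leave those rows as they are.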
Any suggestions? Thanks!