Insert rows where values are missing

Question

4

My data looks like this:

 quarter name  week  value
 17Q3    abc   1     0.7
 17Q3    abc   3     0.65
 17Q3    def   1     0.13
 17Q3    def   2     0.04

Can I insert rows with a value of 0 for the weeks that are missing? For example, the output should look like this:

 quarter name  week  value
 17Q3    abc   1     0.7
 17Q3    abc   3     0.65
 17Q3    abc   2     0.0
 17Q3    def   1     0.13
 17Q3    def   2     0.04
 17Q3    def   3     0.0

The weeks need to be filled up to week 13 (that is, every week from 1 to 13 should be checked).

- umakant

2

Try

library(dplyr);df1 %>% complete(quarter, name, week = full_seq(week, 1), fill = list(value = 0)) %>% arrange(quarter, name, id) %>% mutate(id = row_number()) %>% select(names(df1))

- akrun

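Thanks. I tried library(dplyr);df1 %>% complete(quarter, name, week = full_seq(week, 1), fill = list(value = 0)), but I got an error saying the function complete could not be found. - umakant

The complete function comes from the tidyr package. Sorry. - akrun

Thank you very much, it worked like a charm. Problem solved. - umakant

Actually, I don't have an id column. After running the code above I got duplicate rows. I used library(tidyr);df1 %>% complete(quarter, name, week = full_seq(week, 1), fill = list(value = 0)). - umakant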

2 Answers

0

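Here is an option using the tidyverse. We use complete to get the missing row combinations, arrange by 'quarter', 'name' and 'id', then mutate 'id' to row_number(), and finally select the columns so that they are in the same order as in the original dataset.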

library(tidyverse)
df1 %>%
  complete(quarter, name, week = full_seq(week, 1), fill = list(value = 0)) %>%
  arrange(quarter, name, id) %>%
  mutate(id = row_number()) %>% 
  select(names(df1))
# A tibble: 6 x 5
#     id quarter name   week  value
#  <int> <chr>   <chr> <dbl>  <dbl>
#1     1 17Q3    abc    1.00 0.700 
#2     2 17Q3    abc    3.00 0.650 
#3     3 17Q3    abc    2.00 0     
#4     4 17Q3    def    1.00 0.130 
#5     5 17Q3    def    2.00 0.0400
#6     6 17Q3    def    3.00 0

- akrun

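Thanks, Akrun. Is there a way to do this if we don't have an ID column? I got duplicate rows. One option is to drop the duplicates. - umakant

Another update: I need to fill up to week 13. Adding full_seq(week, 1, 13) did not help. - umakant

@buntysahoo Please update your input data and expected output in the post. - akrun

I have updated the post itself. I need to check and fill up to week 13. - umakant

Problem solved... I used the unique function, which gave exactly the result I needed. Thanks a lot, Akrun. - umakant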
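The two follow-up requests in this thread (no id column, and padding every group out to week 13) can be sketched as below. This is an illustration, not the code the asker ended up with. The sample data is copied from the question, `max_week` is a hypothetical name, and `distinct()` stands in for the `unique` call mentioned above on the assumption that the duplicates are exact copies of whole rows. One likely reason `full_seq(week, 1, 13)` did not help: the third argument of `full_seq()` is a numerical tolerance (`tol`), not an upper bound, so the sequence still stops at the largest week found in the data.

```r
library(dplyr)
library(tidyr)

# Sample data from the question; note that there is no id column.
df1 <- tibble(
  quarter = c("17Q3", "17Q3", "17Q3", "17Q3"),
  name    = c("abc", "abc", "def", "def"),
  week    = c(1, 3, 1, 2),
  value   = c(0.7, 0.65, 0.13, 0.04)
)

# Hypothetical upper bound, taken from "fill up to week 13" in the question.
max_week <- 13

filled <- df1 %>%
  distinct() %>%                                   # drop exact duplicate rows first
  complete(quarter, name,
           week = full_seq(c(1, max_week), 1),     # weeks 1..13, not just the observed ones
           fill = list(value = 0)) %>%             # new rows get value 0
  arrange(quarter, name, week)

filled
```

Passing the range `c(1, max_week)` to `full_seq()` rather than the `week` column is what forces the full 1 to 13 sequence for every quarter and name pair; the observed values are kept and only the newly created rows receive 0.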

- Joe · Accepted Answer

How about using expand inside complete?

library(tidyverse)
complete(df, expand(df, quarter, name, week), fill = list(value=0))

#   quarter name   week  value
#   <fct>   <fct> <int>  <dbl>
# 1 17Q3    abc       1 0.700 
# 2 17Q3    abc       2 0     
# 3 17Q3    abc       3 0.650 
# 4 17Q3    def       1 0.130 
# 5 17Q3    def       2 0.0400
# 6 17Q3    def       3 0

Or, perhaps easier to understand:

df %>% expand(quarter, name, week) %>% left_join(df) %>% replace_na(list(value=0))
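For readers who want to run the expand and join variant end to end, here is a minimal self-contained sketch. The data frame is rebuilt from the sample in the question, and the explicit `by` argument is an addition that is not in the answer above; it only makes the join keys visible and avoids the message dplyr prints when it has to guess them.

```r
library(dplyr)
library(tidyr)

# Sample data from the question.
df <- tibble(
  quarter = c("17Q3", "17Q3", "17Q3", "17Q3"),
  name    = c("abc", "abc", "def", "def"),
  week    = c(1, 3, 1, 2),
  value   = c(0.7, 0.65, 0.13, 0.04)
)

result <- df %>%
  expand(quarter, name, week) %>%                       # every quarter/name/week combination
  left_join(df, by = c("quarter", "name", "week")) %>%  # bring back the observed values
  replace_na(list(value = 0))                           # combinations with no match get 0

result
```

Here `expand()` only uses the weeks that occur somewhere in the data (1, 2 and 3), so this yields six rows. Padding to a fixed range would need an explicit sequence in place of the bare `week` column.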