Expanding a data frame based on a new column

Question

3

I have the following data frame:

df <- data.frame(
     month=c("July", "August", "August"),
     day=c(31, 1, 2),
     time=c(12, 12, 12))

   month day time
1   July  31   12
2 August   1   12
3 August   2   12

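I have a text file of times (in decimal format), and I want to replace the "time" column with all of the times from the text file. The text file covers multiple dates, and each date has more than 300 records.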

7-31-2016 #the days are all concatenated together, this line represents the beginning of one day (July 31)
13.12344
13.66445
13.76892
...
8-1-2016 #here is another day (August 1)
14.50333
14.52000
14.53639
...

However, the text file is much longer than the current data frame: it has 393 records. So I would like the resulting data frame to look like this:

    month   day       time
5    July    31   13.12344
6    July    31   13.66445
7    July    31   13.76892
.....
393 August    1   14.50333
394 August    1   14.52000
394 August    1   14.53639

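Basically, I just need to expand the current data frame to match the number of records in the new file while keeping the same dates. I hope that makes sense.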

- ale19

1

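What is the structure of your text file? - Nico Coallier

Please provide the data frame or list as read from the text file. - OmaymaS

@NicoCoallier The text file is structured exactly as I listed it in the post. It is basically just a series of lists of times concatenated together. Each date marks the start of a new day (e.g. July 31, August 1, etc.). - ale19

@OmaymaS I'm not sure what you are asking for? Both the text file and the existing data frame are very large, so I only provided a sample. - ale19

I'm a bit confused about what problem you are running into. It looks like you just want to format both data frames so they have month and day columns, and then merge them on those columns. Are you having trouble formatting the second data frame? - svenhalvorson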

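The first step is to clean up the text file:

txt <- data.frame(value = c('7-31-2016', '13.12344', '13.66445', '13.76892', '8-1-2016', '14.50333', '14.52000', '14.53639'))

Then do something like txt$dash <- grepl('-', txt$value), but I'm not sure how to expand the data frame based on that logical value so that you end up with a date column and a time column... after that, just join it to your data frame. - pyll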
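The step this comment leaves open (turning the dash flag into a date column) can be sketched in base R. This is only an illustration, not code from the thread: it assumes the file holds one value per line, that the first line is a date header, and that no time value contains a dash. The name `lines` stands in for whatever `readLines()` returns for the real file.

```r
# Sample lines in the layout shown in the question
# (assumption: one value per line, first line is a date header)
lines <- c("7-31-2016", "13.12344", "13.66445", "13.76892",
           "8-1-2016", "14.50333", "14.52000", "14.53639")

# Date headers contain a dash; time values do not
is_header <- grepl("-", lines, fixed = TRUE)

# cumsum() numbers each block of lines; indexing the parsed headers
# by that number repeats every date once per line of its block
block <- cumsum(is_header)
dates <- as.Date(lines[is_header], format = "%m-%d-%Y")[block]

# Keep only the time rows, each paired with the date of its block
parsed <- data.frame(
  date = dates[!is_header],
  time = as.numeric(lines[!is_header])
)
print(parsed)
```

Because the dates are parsed before the header rows are dropped, no fill-down step or extra package is needed in this variant.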

3 Answers

0

Convert the txt file into a data frame that can be merged:

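 library(zoo)                                # provides na.locf()
 # df holds the raw file lines in a single column V1
 df$V1=as.character(df$V1)                   # guard against factor input
 df$V2=suppressWarnings(as.numeric(df$V1))   # date lines become NA
 Temp=is.na(df$V2)                           # TRUE on the date lines
 df$V2=NA
 df$V2[Temp]=df$V1[Temp]                     # copy the dates into V2
 df$V2=na.locf(df$V2)                        # fill each date downwards
 df=df[!Temp,]                               # drop the date header rows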

        V1        V2
2 13.12344 7/31/2016
3 13.66445 7/31/2016
4 13.76892 7/31/2016
6 14.50333  8/1/2016
7    14.52  8/1/2016
8 14.53639  8/1/2016

- BENY

0

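So you want to merge the existing data frame df (only 3 rows) with new_text, which has many rows. Use the following:

merge(df, new_text, all.y = TRUE) # all.y = TRUE keeps rows of new_text that have no match in df, filling df's columns with NA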

See ?merge for more information.
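As a rough illustration of what `all.y` does here, the sketch below uses a small stand-in for `new_text`. Its columns (`month`, `day`, `time`) are an assumption, since the thread never shows the parsed file as a data frame, and the `time` placeholder column is left out of `df` so that it does not become a join key.

```r
# Existing data frame, without the placeholder time column
df <- data.frame(month = c("July", "August", "August"),
                 day   = c(31, 1, 2))

# Hypothetical parsed text file: several times per date
new_text <- data.frame(
  month = c("July", "July", "August", "August"),
  day   = c(31, 31, 1, 1),
  time  = c(13.12344, 13.66445, 14.50333, 14.52000)
)

# all.y = TRUE keeps every row of new_text; rows of df with no
# match (August 2) are dropped because all.x defaults to FALSE
merged <- merge(df, new_text, by = c("month", "day"), all.y = TRUE)

# all = TRUE also keeps August 2, with NA in the time column
merged_all <- merge(df, new_text, by = c("month", "day"), all = TRUE)

print(merged)
print(merged_all)
```

Naming the key columns in `by` avoids relying on merge()'s default of joining on every shared column name.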

- Matt

- pyll · Accepted Answer

# Create txt data
txt <- data.frame(x = c('7-31-2016', '13.12344', '13.66445', '13.76892', '8-1-2016', '14.50333', '14.52000', '14.53639'))
# Load Your data 
df <- data.frame(
  month=c("July", "August", "August"),
  day=c(31, 1, 2),
  time=c(12, 12, 12))

# Need a year to join dates
df$year <- 2016

# Create date column (%B matches full month names in the current locale; English is assumed here)
df$date <- as.Date(paste0(df$month, "/", df$day, "/", df$year), format = "%B/%d/%Y")

# Find values with dashes, then replaces with /
txt$dash <- grepl('-', txt$x)
txt$x <- gsub("-", "/", txt$x)

# Adds new columns
library(dplyr)
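# Parse dates outside ifelse(), which would otherwise strip the Date class
txt <- mutate(txt, date = as.Date(ifelse(dash, x, NA), format = "%m/%d/%Y"))
# Date rows are not numeric, so silence the expected coercion warning
txt <- mutate(txt, time = ifelse(dash, NA, suppressWarnings(as.numeric(x))))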

# Fill down values
library(zoo)
txt$date <- na.locf(txt$date)

# Removes NA and keeps necessary columns
txt <- txt[!is.na(txt$time),]
txt <- txt[c("date", "time")]

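# Merge, dropping df's placeholder time column so the times from txt replace it
# (otherwise the result has both time.x and time.y)
output <- merge(df[setdiff(names(df), "time")], txt, by = "date")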
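To get back to the month / day / time layout the question asks for, the month and day can be derived from the date column. The sketch below is an assumption-laden illustration rather than part of the answer: it uses a small hand-made `output` with only `date` and `time` columns, and takes month names from the built-in `month.name` constant so the result does not depend on the locale.

```r
# Hypothetical merged result with just the columns needed here
output <- data.frame(
  date = as.Date(c("2016-07-31", "2016-07-31", "2016-08-01")),
  time = c(13.12344, 13.66445, 14.50333)
)

# Derive month and day from the date; month.name is always English
output$month <- month.name[as.integer(format(output$date, "%m"))]
output$day   <- as.integer(format(output$date, "%d"))

# Order chronologically and keep the columns from the question
result <- output[order(output$date), c("month", "day", "time")]
print(result)
```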