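I came across some strange behavior in the `lubridate` package: `dmy(NA)` throws an error instead of just returning NA. This causes problems when I want to convert a column in which some elements are NA and the others are date strings that normally convert without trouble.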
Below is a minimal example:
library(lubridate)
df <- data.frame(ID=letters[1:5],
                 Datum=c("01.01.1990", NA, "11.01.1990", NA, "01.02.1990"))
df_copy <- df
#Question 1: Why does dmy(NA) throw an error instead of returning NA?
df$Datum <- dmy(df$Datum)
Error in function (..., sep = " ", collapse = NULL) : invalid separator
df <- df_copy
#Question 2: What's a workaround?
#Idea 1: Only convert those elements that are not NA
#The RHS works, but assigning it to the LHS doesn't (most likely problem:
#column "Datum" is still of class factor, while the RHS is of class POSIXct)
df[!is.na(df$Datum), "Datum"] <- dmy(df[!is.na(df$Datum), "Datum"])
Using date format %d.%m.%Y.
Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = c(NA_integer_, NA_integer_, :
invalid factor level, NAs generated
df #Only NAs, apparently problem with class of column "Datum"
  ID Datum
1  a  <NA>
2  b  <NA>
3  c  <NA>
4  d  <NA>
5  e  <NA>
df <- df_copy
#Idea 2: Use mapply and apply dmy only to those elements that are not NA
df[, "Datum"] <- mapply(function(x) {if (is.na(x)) {
  return(NA)
} else {
  return(dmy(x))
}}, df$Datum)
df #Meaningless numbers returned instead of date-objects
  ID     Datum
1  a 631152000
2  b        NA
3  c 632016000
4  d        NA
5  e 633830400
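To sum up, I have two questions: 1) Why does `dmy(NA)` not work? Based on how most other functions behave, I would expect every conversion of NA (e.g. with `dmy()`) to return NA, just as `2 + NA` does, as a matter of good programming practice. 2) If this is the intended behavior, how do I convert a data.frame column that contains NAs with the `dmy()` function?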
`lubridate` fails to parse NA values correctly: https://github.com/hadley/lubridate/issues/88 - Andrie

The error is caused by the `lubridate:::guess_format()` function. NA is passed as `sep` in a call to `paste()`, specifically at `fmts <- unlist(mlply(with_seps, paste))`. - jthetzel
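
The mechanism described in the comment above can be illustrated without `lubridate` at all. The sketch below is only an illustration, not code from the thread: the first part shows that base R's `paste()` rejects NA as a separator, which matches the "invalid separator" message in the question. The second part is one possible workaround under the assumption that every date uses the `%d.%m.%Y` format shown in the example; it swaps `dmy()` for base R's `as.Date()`, which passes NA through. Newer `lubridate` versions may behave differently from the one discussed here.

```r
# Part 1: paste() refuses NA as a separator, independent of lubridate.
# tryCatch() captures the error message instead of stopping the script.
err <- tryCatch(paste("d", "m", "Y", sep = NA),
                error = function(e) conditionMessage(e))
print(err)

# Part 2: a base R workaround (an assumption, not the thread's solution).
# stringsAsFactors = FALSE keeps Datum as character rather than factor.
df <- data.frame(ID = letters[1:5],
                 Datum = c("01.01.1990", NA, "11.01.1990", NA, "01.02.1990"),
                 stringsAsFactors = FALSE)

# as.Date() with an explicit format returns NA for NA input;
# as.character() also guards against the column being a factor.
df$Datum <- as.Date(as.character(df$Datum), format = "%d.%m.%Y")
print(df)
```

Because the result is a real `Date` column, the class mismatch seen in the first idea (factor on the left, date-time on the right) does not arise: the whole column is replaced at once rather than assigned into element by element.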