Difference in behavior between as_date and as_datetime in lubridate

3

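I have a numeric vector representing the number of milliseconds since 1970-01-01. I would like to convert these values to date-time objects using lubridate. Here is a sample of the data: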

raw_times <- c(1139689917479, 1139667123031, 1140364113915, 1140364951003, 
               1139643685434, 1139677091970, 1139691963511, 1140339448413, 1140368308429, 
               1139686613641, 1139666081813, 1140351488730, 1140346617958, 1141933663183, 
               1141933207579, 1140360125149, 1140351845108, 1140365079103, 1141933549825, 
               1140365601476)

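Since I understood the documentation to say that as_date and as_datetime take a numeric vector representing the number of days since 1970-01-01, I tried the following: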

library(lubridate)

as_date(raw_times / (1000 * 60 * 60 * 24))
"2006-02-11" "2006-02-11" "2006-02-19" "2006-02-19" "2006-02-11" 
"2006-02-11" "2006-02-11" "2006-02-19" "2006-02-19" "2006-02-11" 
"2006-02-11" "2006-02-19" "2006-02-19" "2006-03-09" "2006-03-09"
"2006-02-19" "2006-02-19" "2006-02-19" "2006-03-09" "2006-02-19"

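(Obviously, this uses the fact that there are 1000 milliseconds in a second, 60 seconds in a minute, 60 minutes in an hour, and 24 hours in a day.)

When I run the same code with as_datetime, I get the following: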
as_datetime(raw_times / (1000 * 60 * 60 * 24))
"1970-01-01 03:39:50 UTC" "1970-01-01 03:39:50 UTC" "1970-01-01 03:39:58 UTC" "1970-01-01 03:39:58 UTC" "1970-01-01 03:39:50 UTC" "1970-01-01 03:39:50 UTC"
"1970-01-01 03:39:50 UTC" "1970-01-01 03:39:58 UTC" "1970-01-01 03:39:58 UTC" "1970-01-01 03:39:50 UTC" "1970-01-01 03:39:50 UTC" "1970-01-01 03:39:58 UTC"
"1970-01-01 03:39:58 UTC" "1970-01-01 03:40:16 UTC" "1970-01-01 03:40:16 UTC" "1970-01-01 03:39:58 UTC" "1970-01-01 03:39:58 UTC" "1970-01-01 03:39:58 UTC"
"1970-01-01 03:40:16 UTC" "1970-01-01 03:39:58 UTC"

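The results are different. I assume there is some other argument I am overlooking, but I cannot find anything in the documentation that tells me what it is.

Here is the session info: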
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lubridate_1.6.0

loaded via a namespace (and not attached):
[1] magrittr_1.5  tools_3.3.2   stringi_1.1.2 stringr_1.1.0

2
If you're willing to accept a "base" solution, then .POSIXct(raw_times/1000) will work. - Joshua Ulrich
as_datetime is the function for POSIXct: https://github.com/hadley/lubridate/blob/ac5021716235c7aa29cad4761c429c4539d22ae4/NEWS.md - Hack-R
Yes, I think .POSIXct will work. Thanks. - Nick Criswell
2 Answers

5

Not a lubridate solution, but you can use base::.POSIXct to do this:

R> options(digits.secs=3)
R> .POSIXct(raw_times/1000)
 [1] "2006-02-11 14:31:57.479 CST" "2006-02-11 08:12:03.030 CST"
 [3] "2006-02-19 09:48:33.914 CST" "2006-02-19 10:02:31.003 CST"
 [5] "2006-02-11 01:41:25.434 CST" "2006-02-11 10:58:11.970 CST"
 [7] "2006-02-11 15:06:03.510 CST" "2006-02-19 02:57:28.413 CST"
 [9] "2006-02-19 10:58:28.428 CST" "2006-02-11 13:36:53.641 CST"
[11] "2006-02-11 07:54:41.812 CST" "2006-02-19 06:18:08.730 CST"
[13] "2006-02-19 04:56:57.957 CST" "2006-03-09 13:47:43.183 CST"
[15] "2006-03-09 13:40:07.578 CST" "2006-02-19 08:42:05.148 CST"
[17] "2006-02-19 06:24:05.108 CST" "2006-02-19 10:04:39.102 CST"
[19] "2006-03-09 13:45:49.825 CST" "2006-02-19 10:13:21.476 CST"
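
For comparison, a lubridate version appears to work as well once the values are scaled to seconds rather than days: as_datetime() treats a numeric input as seconds since the epoch, whereas as_date() treats it as days, which would account for the difference seen in the question. A minimal sketch using only the first three values from the question (the names secs, dt, and d are illustrative, not from the original post):

```r
library(lubridate)

# First three millisecond timestamps from the question
raw_times <- c(1139689917479, 1139667123031, 1140364113915)

# as_datetime() reads numerics as seconds since 1970-01-01 UTC,
# so divide milliseconds by 1000 (not by the number of ms in a day)
secs <- raw_times / 1000
dt <- as_datetime(secs)

# as_date() reads numerics as days, so the day-scaled input suits it
d <- as_date(raw_times / (1000 * 60 * 60 * 24))

print(dt)
print(d)
```

Note that as_datetime() returns values in UTC by default, so the clock times differ from the CST output above by the local offset.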

4
Another solution is to use the relatively new anytime package, whose task is to convert anything to proper Date or POSIXct objects with minimal input and fuss.

The anytime() function also accepts (suitably scaled) seconds since the epoch:

R> raw_times <- c(1139689917479, 1139667123031, 1140364113915,
+                1140364951003, 1139643685434, 1139677091970,
+                1139691963511, 1140339448413, 1140368308429,
+                1139686613641, 1139666081813, 1140351488730,
+                1140346617958, 1141933663183, 1141933207579,
+                1140360125149, 1140351845108, 1140365079103,
+                1141933549825, 1140365601476)
R> scaled_times <- raw_times / 1000
R> library(anytime)
R> options(digits.secs=6)   # subsecond display
R> anytime(scaled_times)           
 [1] "2006-02-11 14:31:57.479 CST"
 [2] "2006-02-11 08:12:03.030 CST"
 [3] "2006-02-19 09:48:33.914 CST"
 [4] "2006-02-19 10:02:31.003 CST"
 [5] "2006-02-11 01:41:25.434 CST"
 [6] "2006-02-11 10:58:11.970 CST"
 [7] "2006-02-11 15:06:03.510 CST"
 [8] "2006-02-19 02:57:28.413 CST"
 [9] "2006-02-19 10:58:28.428 CST"
[10] "2006-02-11 13:36:53.641 CST"
[11] "2006-02-11 07:54:41.812 CST"
[12] "2006-02-19 06:18:08.730 CST"
[13] "2006-02-19 04:56:57.957 CST"
[14] "2006-03-09 13:47:43.183 CST"
[15] "2006-03-09 13:40:07.578 CST"
[16] "2006-02-19 08:42:05.148 CST"
[17] "2006-02-19 06:24:05.108 CST"
[18] "2006-02-19 10:04:39.102 CST"
[19] "2006-03-09 13:45:49.825 CST"
[20] "2006-02-19 10:13:21.476 CST"
R> 

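Using anytime() is a bit of overkill here (as Josh showed), but using an exported function may be preferable to relying on a hidden base function. And anytime() beats the official as.POSIXct() because it does not require the origin (again and again).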
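
For reference, the as.POSIXct() route with an explicit origin might look like the sketch below. It uses only the first three values from the question; the name pt and the choice of tz = "UTC" are assumptions made for illustration, whereas the outputs above were printed in the answerer's local time zone.

```r
# First three millisecond timestamps from the question
raw_times <- c(1139689917479, 1139667123031, 1140364113915)

# Documented base conversion: numeric seconds plus an explicit origin.
# tz = "UTC" is an assumption; omit it to use the local time zone.
pt <- as.POSIXct(raw_times / 1000, origin = "1970-01-01", tz = "UTC")

# %OS3 asks for three fractional-second digits when formatting
print(format(pt, "%Y-%m-%d %H:%M:%OS3"))
```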

