I am trying to run some R code, but it crashes because of memory. The error message I get is:
Error in sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE)) :
long vectors not supported yet: memory.c:3100
The function causing the problem is:
StationUserX <- function(userNDX) {
  lat1 = deg2rad(geolocation$latitude[userNDX])
  long1 = deg2rad(geolocation$longitude[userNDX])
  session_user_id = as.character(geolocation$session_user_id[userNDX])
  # Find closest station
  Distance2Stations <- unlist(lapply(stationNDXs, Distance2StationX, lat1, long1))
  # Return index for closest station and distance to closest station
  stations_userX = data.frame(session_user_id = session_user_id,
                              station = ghcndstations$ID[stationNDXs],
                              Distance2Station = Distance2Stations)
  stations_userX = stations_userX[with(stations_userX, order(Distance2Station)), ]
  stations_userX = stations_userX[1:100, ]  # only the 100 closest stations...
  row.names(stations_userX) <- NULL
  return(stations_userX)
}
I run this function 50k times using mclapply. Each call to StationUserX calls Distance2StationX 90k times.
Is there an obvious way to optimize the function StationUserX?
Have you tried Vectorize, or cmpfun from the compiler package, to see whether either gives an easy speedup? - Gary Weissman
Use foreach to parallelize; that is very easy to implement. - Gary Weissman
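One way to read the vectorization suggestion above is to compute the distance to every station in a single call and build the data frame only for the closest ones. The sketch below is not the asker's code: Distance2StationX is not shown in the question, so haversine_vec is a hypothetical stand-in that assumes a great-circle (haversine) distance, and closest_stations, n_keep and the toy data are likewise made up for illustration.

```r
# Hypothetical stand-in for the unseen Distance2StationX: haversine distance
# in km from one point to many stations at once (all angles in radians).
haversine_vec <- function(lat1, long1, lat2, long2, radius_km = 6371) {
  dlat <- lat2 - lat1
  dlong <- long2 - long1
  a <- sin(dlat / 2)^2 + cos(lat1) * cos(lat2) * sin(dlong / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Vectorized variant of StationUserX: one call covers every station, and the
# data frame is built only for the n_keep closest stations.
closest_stations <- function(user_id, lat1, long1, station_ids,
                             station_lat, station_long, n_keep = 100) {
  d <- haversine_vec(lat1, long1, station_lat, station_long)
  keep <- head(order(d), n_keep)  # head() avoids NA rows when there are fewer stations
  data.frame(session_user_id = rep(user_id, length(keep)),
             station = station_ids[keep],
             Distance2Station = d[keep],
             stringsAsFactors = FALSE)
}

# Toy data (made up for illustration): three stations, user at the origin.
deg2rad <- function(deg) deg * pi / 180
res <- closest_stations("u1", deg2rad(0), deg2rad(0),
                        station_ids = c("A", "B", "C"),
                        station_lat = deg2rad(c(10, 1, 5)),
                        station_long = deg2rad(c(0, 0, 0)),
                        n_keep = 2)
print(res)
```

Under these assumptions each worker returns a 100-row data frame with plain character columns, so the results that mclapply has to send back to the master stay small. Whether that is enough to avoid the long-vector error depends on parts of the setup the question does not show.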