如果栅格值为 NA,则搜索并提取最近的非 NA 像素。

9

当我将栅格的值提取到点时,我发现有几个 NA,而我不想使用extract函数中的bufferfun参数,相反,我想提取最近的非NA像素到与NA重叠的点。

我正在使用基本的extract函数:

data.extr<-extract(loc.thr, data[,11:10])
5个回答

8

以下是不使用缓存的解决方案。然而,它会为数据集中的每个点单独计算距离地图,因此如果您的数据集很大,则可能效率低下。

set.seed(2)

# create a 10x10 raster
r <- raster(ncol=10,nrow=10, xmn=0, xmx=10, ymn=0,ymx=10)
r[] <- 1:10
r[sample(1:ncell(r), size = 25)] <- NA

# plot the raster
plot(r, axes=F, box=F)
segments(x0 = 0, y0 = 0:10, x1 = 10, y1 = 0:10, lty=2)
segments(y0 = 0, x0 = 0:10, y1 = 10, x1 = 0:10, lty=2)

# create sample points and add them to the plot
xy = data.frame(x=runif(10,1,10), y=runif(10,1,10))
points(xy, pch=3)
text(x = xy$x, y = xy$y, labels = as.character(1:nrow(xy)), pos=4, cex=0.7, xpd=NA)

# use normal extract function to show that NAs are extracted for some points
extracted = extract(x = r, y = xy)

# then take the raster value with lowest distance to point AND non-NA value in the raster
sampled = apply(X = xy, MARGIN = 1, FUN = function(xy) r@data@values[which.min(replace(distanceFromPoints(r, xy), is.na(r), NA))])

# show output of both procedures
print(data.frame(xy, extracted, sampled))

#          x        y extracted sampled
#1  5.398959 6.644767         6       6
#2  2.343222 8.599861        NA       3
#3  4.213563 3.563835         5       5
#4  9.663796 7.005031        10      10
#5  2.191348 2.354228        NA       2
#6  1.093731 9.835551         2       2
#7  2.481780 3.673097         3       3
#8  8.291729 2.035757         9       9
#9  8.819749 2.468808         9       9
#10 5.628536 9.496376         6       6

您的解决方案已成功处理了我的光栅任务,而该任务并不是很大。 - Joe
谢谢,koekenbakker,非常方便。 - J. Win.
哦,但如果你不在内存中...那么这个就不再起作用了。你有什么想法? - jebyrnes
不错的解决方案,但对于大量点(30k)会导致R冻结。 - Herman Toothrot

4
这是一种基于栅格的解决方案,首先使用最近的非 NA 像素值填充 NA 像素。但请注意,这不考虑点在像素内的位置。相反,它计算像素中心之间的距离,以确定最近的非 NA 像素。
首先,它为每个 NA 栅格像素计算到最近的非 NA 像素的距离和方向。下一步是计算此非 NA 单元的坐标(假定投影 CRS),提取其值并将该值存储在 NA 位置。
起始数据:一个投影栅格,与 koekenbakker 的 答案 中的值相同:
set.seed(2)
# set projected CRS
r <- raster(ncol=10,nrow=10, xmn=0, xmx=10, ymn=0,ymx=10, crs='+proj=utm +zone=1') 
r[] <- 1:10
r[sample(1:ncell(r), size = 25)] <- NA

# create sample points
xy = data.frame(x=runif(10,1,10), y=runif(10,1,10))

# use normal extract function to show that NAs are extracted for some points
extracted <- raster::extract(x = r, y = xy)

计算所有NA像素到最近的非NA像素的距离和方向:
dist <- distance(r)  
# you can also set a maximum distance: dist[dist > maxdist] <- NA
direct <- direction(r, from=FALSE)

检索NA像素的坐标

# NA raster
rna <- is.na(r) # returns NA raster

# store coordinates in new raster: https://dev59.com/4JTfa4cB1Zd3GeqPP2Gd#35592230 
na.x <- init(rna, 'x')
na.y <- init(rna, 'y')

# calculate coordinates of the nearest Non-NA pixel
# assume that we have a orthogonal, projected CRS, so we can use (Pythagorean) calculations
co.x <- na.x + dist * sin(direct)
co.y <- na.y + dist * cos(direct)

# matrix with point coordinates of nearest non-NA pixel
co <- cbind(co.x[], co.y[]) 

提取最近的非NA单元格的值,其坐标为'co'

# extract values of nearest non-NA cell with coordinates co
NAVals <- raster::extract(r, co, method='simple') 
r.NAVals <- rna # initiate new raster
r.NAVals[] <- NAVals # store values in raster

用新值填充原始光栅

# cover nearest non-NA value at NA locations of original raster
r.filled <- cover(x=r, y= r.NAVals)

sampled <- raster::extract(x = r.filled, y = xy)

# compare old and new values
print(data.frame(xy, extracted, sampled))

#           x        y extracted sampled
# 1  5.398959 6.644767         6       6
# 2  2.343222 8.599861        NA       3
# 3  4.213563 3.563835         5       5
# 4  9.663796 7.005031        10      10  
# 5  2.191348 2.354228        NA       3
# 6  1.093731 9.835551         2       2
# 7  2.481780 3.673097         3       3
# 8  8.291729 2.035757         9       9
# 9  8.819749 2.468808         9       9 
# 10 5.628536 9.496376         6       6

请注意,点5的值与Koekenbakker的answer不同,因为该方法未考虑点在像素内的位置(如上所述)。如果这很重要,则此解决方案可能不适合。在其他情况下,例如如果光栅单元格与点精度相比较小,则基于光栅的方法应该能够给出良好的结果。

很好的解释,我很高兴你澄清了这一点。 - Joe
也许这可以算作一个不同的问题,但是如果能够返回新的xy点并且对应的数据非NA,那将会很好。 - Joe

2

对于栅格堆栈,请使用@koekenbakker上面提供的解决方案,并将其转换为函数。栅格堆栈的@layers插槽是栅格列表,因此,可以在其中应用lapply并从那里开始。

#new layer
r2 <- raster(ncol=10,nrow=10, xmn=0, xmx=10, ymn=0,ymx=10)
r2[] <- 1:10
r2[sample(1:ncell(r2), size = 25)] <- NA

#make the stack
r_stack <- stack(r, r2)

#a function for sampling
sample_raster_NA <- function(r, xy){
  apply(X = xy, MARGIN = 1, 
        FUN = function(xy) r@data@values[which.min(replace(distanceFromPoints(r, xy), is.na(r), NA))])

}

#lapply to get answers
lapply(r_stack@layers, function(a_layer) sample_raster_NA(a_layer, xy))

或者为了更高端(提升速度?)

purrr::map(r_stack@layers, sample_raster_NA, xy=xy)

这让我想知道是否可以使用dplyr更快地完成整个事情...


0
在seegSDM包中的nearestLand函数可以很好地完成这个任务。
set.seed(2)

# create a 10x10 raster
r <- raster(ncol=10,nrow=10, xmn=0, xmx=10, ymn=0,ymx=10)
r[] <- 1:10
r[sample(1:ncell(r), size = 25)] <- NA
extracted = extract(x = r, y = xy)

# find points that fall in NA areas on the raster
plot(r)
segments(x0 = 0, y0 = 0:10, x1 = 10, y1 = 0:10, lty=2)
segments(y0 = 0, x0 = 0:10, y1 = 10, x1 = 0:10, lty=2)
outside_mask <- is.na(extracted)
outside_pts <- xy[outside_mask,]

points(xy, pch = 16)
text(x = xy$x, y = xy$y, labels = as.character(1:nrow(xy)), pos=4, cex=0.7, xpd=NA)

points(outside_pts, pch = 16, col = 'red')


# find the nearest cell
nearest.cells <- nearestLand(outside_pts, r, 90000)

# plot the new points (for those which were reassigned) in green
points(nearest.cells, pch = 16, col = ' green')

# plot where they moved to
arrows(outside_pts[, 1],
       outside_pts[, 2],
       nearest.cells[, 1],
       nearest.cells[, 2], length = 0.1)

#Extracting the raster values using the new coordinates:
extract(x = r, y = nearest.cells)

0

SEEG-Oxford/seegSDM

这个 GitHub 仓库有一个非常有用的函数,如下所示。它返回最近的非 NA 像素的坐标。

nearestLand <- function (points, raster, max_distance) {
  nearest <- function (lis, raster) {
    neighbours <- matrix(lis[[1]], ncol = 2)
    point <- lis[[2]]
    land <- !is.na(neighbours[, 2])
    if (!any(land)) {
      return (c(NA, NA))
    } else{
      coords <- xyFromCell(raster, neighbours[land, 1])   
      if (nrow(coords) == 1) {
        return (coords[1, ])
      }
      dists <- sqrt((coords[, 1] - point[1]) ^ 2 +
                      (coords[, 2] - point[2]) ^ 2)
      return (coords[which.min(dists), ])
    }
  }
  neighbour_list <- extract(raster, points,
                            buffer = max_distance,
                            cellnumbers = TRUE)
  neighbour_list <- lapply(1:nrow(points),
                           function(i) {
                             list(neighbours = neighbour_list[[i]],
                                  point = as.numeric(points[i, ]))
                           })
  return (t(sapply(neighbour_list, nearest, raster)))
}

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接