How do I stop geom_point from removing rows containing missing values?

8
I'm not sure why none of my data points show up on the map.
   Store_ID visits CRIND_CC  ISCC  EBITDAR top_bottom   Latitude  Longitude
      (int)  (int)    (int) (int)    (dbl)      (chr)     (fctr)     (fctr)
1        92    348    14819 39013 76449.15        top  41.731373  -93.58184
2      2035    289    15584 35961 72454.42        top  41.589428  -93.80785
3        50    266    14117 27262 49775.02        top  41.559017  -93.77287
4       156    266     7797 25095 28645.95        top    41.6143 -93.834404
5        66    234     8314 18718 46325.12        top    41.6002 -93.779236
6       207     18     2159 17999 20097.99     bottom  41.636208 -93.531876
7        59     23    10547 28806 52168.07     bottom   41.56153  -93.88083
8       101     23     1469 11611  7325.45     bottom   41.20982  -93.84298
9       130     26     2670 13561 14348.98     bottom  41.614517  -93.65789
10      130     26     2670 13561 14348.98     bottom 41.6145172  -93.65789
11       24     27    17916 41721 69991.10     bottom  41.597134  -93.49263

> dput(droplevels(top_bottom))
structure(list(Store_ID = c(92L, 2035L, 50L, 156L, 66L, 207L, 
59L, 101L, 130L, 130L, 24L), visits = c(348L, 289L, 266L, 266L, 
234L, 18L, 23L, 23L, 26L, 26L, 27L), CRIND_CC = c(14819L, 15584L, 
14117L, 7797L, 8314L, 2159L, 10547L, 1469L, 2670L, 2670L, 17916L
), ISCC = c(39013L, 35961L, 27262L, 25095L, 18718L, 17999L, 28806L, 
11611L, 13561L, 13561L, 41721L), EBITDAR = c(76449.15, 72454.42, 
49775.02, 28645.95, 46325.12, 20097.99, 52168.07, 7325.45, 14348.98, 
14348.98, 69991.1), top_bottom = c("top", "top", "top", "top", 
"top", "bottom", "bottom", "bottom", "bottom", "bottom", "bottom"
), Latitude = structure(c(11L, 4L, 2L, 7L, 6L, 10L, 3L, 1L, 8L, 
9L, 5L), .Label = c("41.20982", "41.559017", "41.56153", "41.589428", 
"41.597134", "41.6002", "41.6143", "41.614517", "41.6145172", 
"41.636208", "41.731373"), class = "factor"), Longitude = structure(c(3L, 
7L, 5L, 8L, 6L, 2L, 10L, 9L, 4L, 4L, 1L), .Label = c("-93.49263", 
"-93.531876", "-93.58184", "-93.65789", "-93.77287", "-93.779236", 
"-93.80785", "-93.834404", "-93.84298", "-93.88083"), class = "factor")), row.names = c(NA, 
-11L), .Names = c("Store_ID", "visits", "CRIND_CC", "ISCC", "EBITDAR", 
"top_bottom", "Latitude", "Longitude"), class = c("tbl_df", "tbl", 
"data.frame"))

Creating the plot:
map <- qmap('Des Moines') +
       geom_point(data = top_bottom, aes(x = as.numeric(Longitude),
                  y = as.numeric(Latitude)), colour = top_bottom, size = 3)

I get this warning message:

Removed 11 rows containing missing values (geom_point). 

It does work without ggmap(), though:

ggplot(top_bottom) +  
geom_point(aes(x = as.numeric(Longitude), y = as.numeric(Latitude)),
           colour = top_bottom, size = 3)


How do I overlay the points on the ggmap?


1
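I think some of your data points fall outside the bbox of the qmap object. I suspect that is why some data points are being removed automatically. - jazzurro
This question would get more responses if it were substantially improved. First, use dput instead of pasting the data directly. Second, list the packages your code uses. - alexwhitworth
That's true. Converting from factor to numeric in the geom_point line changes the values. Is there a way to avoid that? I can't plot the factors as they are, because I get the error "Discrete value supplied to continuous scale". - herkyonparade
That said, I'm not very familiar with ggmap, but I think you may need a ggplot object. For example, what happens with map <- ggmap() + ggplot() + geom_point()? - alexwhitworth
2 Answers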

8
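You are calling `as.numeric()` on a variable of class `factor`. As described here, that returns the level codes of the variable (not the numbers its labels represent). Unsurprisingly, none of those codes fall on the canvas shown for "Des Moines".
Use `as.numeric(as.character(Latitude))` and `as.numeric(as.character(Longitude))` instead, ugly as that looks.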
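
To make the difference concrete, here is a minimal sketch using three of the latitude values from the question. The variable names (`lat`, `codes`, `values`, `values2`) are made up for illustration; `as.numeric(levels(f))[f]` is the equivalent idiom suggested in R's own `?factor` help page.

```r
# Latitude stored as a factor, as in the question's dput() output
lat <- factor(c("41.731373", "41.589428", "41.20982"))

# as.numeric() on a factor returns the internal level codes.
# Levels are sorted, so the codes here are 3, 2, 1, not latitudes.
codes <- as.numeric(lat)

# Going through character first recovers the numbers the labels represent
values <- as.numeric(as.character(lat))

# Equivalent idiom: convert the levels once, then index by the factor
values2 <- as.numeric(levels(lat))[lat]

print(codes)
print(values)
```

Level codes such as 1 to 11 are nowhere near latitude 41 or longitude -93, which would explain why every row was dropped from the map.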

3

Judging from the sample data, one data point appears to lie outside the map area.

library(dplyr)
library(ggplot2)
library(ggmap)

### You can find lon/lat for bbox using your ggmap object.
### For instance, des1 <- ggmap(mymap1)
### str(des1)
### You could use bb2bbox() in the ggmap package to find lon/lat.

filter(top_bottom,
       between(Latitude, 41.27057, 41.92782),
       between(Longitude, -94.04787, -93.16897)) -> inside

setdiff(top_bottom, inside)

#  Store_ID visits CRIND_CC  ISCC EBITDAR top_bottom Latitude Longitude
#1      101     23     1469 11611 7325.45     bottom 41.20982 -93.84298

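Because you called qmap() without specifying a zoom level, I don't know which zoom level you ended up with. Let's experiment a little. In the first case, one data point is lost: `Removed 1 rows containing missing values (geom_point).`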

mymap1 <- get_map('Des Moines', zoom = 10)

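ggmap(mymap1) +
# Latitude/Longitude are already numeric in the data below;
# colour is mapped inside aes() so it refers to the top_bottom column
geom_point(data = top_bottom, aes(x = Longitude, y = Latitude,
           colour = top_bottom), size = 3)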


mymap2 <- get_map('Des Moines', zoom = 9)

ggmap(mymap2) +
# Same layer as above; only the zoom level of the base map differs
geom_point(data = top_bottom, aes(x = Longitude, y = Latitude,
           colour = top_bottom), size = 3)


So the key point is to choose a zoom level that suits your data set. To do that, you may want to specify zoom in qmap(). I hope this helps.
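
One way to check whether a map extent suits the data is to compare it with the extent of the points. The sketch below uses base R only. `data_bbox()` and its `pad` argument are hypothetical, not part of ggmap. `zoom10` holds the limits used in the filter() call earlier in this answer, and the coordinates are copied from the data at the end of this answer.

```r
# Hypothetical helper (not a ggmap function): bounding box of the points,
# widened on each side by a fraction `pad` of the data range
data_bbox <- function(lon, lat, pad = 0.05) {
  lon_r <- range(lon, na.rm = TRUE)
  lat_r <- range(lat, na.rm = TRUE)
  c(left   = lon_r[1] - pad * diff(lon_r),
    bottom = lat_r[1] - pad * diff(lat_r),
    right  = lon_r[2] + pad * diff(lon_r),
    top    = lat_r[2] + pad * diff(lat_r))
}

lon <- c(-93.58184, -93.80785, -93.77287, -93.834404, -93.779236,
         -93.531876, -93.88083, -93.84298, -93.65789, -93.65789, -93.49263)
lat <- c(41.731373, 41.589428, 41.559017, 41.6143, 41.6002, 41.636208,
         41.56153, 41.20982, 41.614517, 41.6145172, 41.597134)

bb <- data_bbox(lon, lat)

# Extent of the zoom = 10 map, taken from the filter() limits above
zoom10 <- c(left = -94.04787, bottom = 41.27057,
            right = -93.16897, top = 41.92782)

# TRUE only if the whole padded data extent lies within the map
fits <- bb[["left"]] >= zoom10[["left"]] && bb[["right"]] <= zoom10[["right"]] &&
        bb[["bottom"]] >= zoom10[["bottom"]] && bb[["top"]] <= zoom10[["top"]]

print(bb)
print(fits)
```

Here `fits` comes out FALSE because the southernmost store (latitude 41.20982) lies below the bottom edge of the zoom = 10 map, which matches the single row dropped in the first plot.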

Data

top_bottom <- structure(list(Store_ID = c(92L, 2035L, 50L, 156L, 66L, 207L, 
59L, 101L, 130L, 130L, 24L), visits = c(348L, 289L, 266L, 266L, 
234L, 18L, 23L, 23L, 26L, 26L, 27L), CRIND_CC = c(14819L, 15584L, 
14117L, 7797L, 8314L, 2159L, 10547L, 1469L, 2670L, 2670L, 17916L
), ISCC = c(39013L, 35961L, 27262L, 25095L, 18718L, 17999L, 28806L, 
11611L, 13561L, 13561L, 41721L), EBITDAR = c(76449.15, 72454.42, 
49775.02, 28645.95, 46325.12, 20097.99, 52168.07, 7325.45, 14348.98, 
14348.98, 69991.1), top_bottom = structure(c(2L, 2L, 2L, 2L, 
2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("bottom", "top"), class = "factor"), 
Latitude = c(41.731373, 41.589428, 41.559017, 41.6143, 41.6002, 
41.636208, 41.56153, 41.20982, 41.614517, 41.6145172, 41.597134
), Longitude = c(-93.58184, -93.80785, -93.77287, -93.834404, 
-93.779236, -93.531876, -93.88083, -93.84298, -93.65789, 
-93.65789, -93.49263)), .Names = c("Store_ID", "visits", 
"CRIND_CC", "ISCC", "EBITDAR", "top_bottom", "Latitude", "Longitude"
), class = "data.frame", row.names = c("1", "2", "3", "4", "5", 
"6", "7", "8", "9", "10", "11"))
