How to subset a list in R based on the length of its elements

Question

4

In R, I have a function (ip2coordinates, from the RDSTK package) that looks up 11 fields of data for each IP address you supply.

I have a vector of IPs called ip.addresses:

> head(ip.addresses)
[1] "128.177.90.11"  "71.179.12.143"  "66.31.55.111"   "98.204.243.187" "67.231.207.9"   "67.61.248.12"

Note: these or any other IPs can be used to reproduce the problem.

So I use sapply to apply the function to that object:

ips.info     <- sapply(ip.addresses, ip2coordinates)

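As a result I get a list called ips.info. That is fine, but I can't do much more with a list, so I need to convert it to a data frame. The problem is that not all IP addresses are in the database, so some list elements have only 1 field, and I get this error: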

> ips.df       <- as.data.frame(ips.info)
Error in data.frame(`128.177.90.10` = list(ip.address = "128.177.90.10",  :

  arguments imply differing number of rows: 1, 0

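My question is: "How do I remove the elements with missing/incomplete data, or otherwise convert this list into a data frame with 11 columns and 1 row per IP address?"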

I have tried several things.

First, I tried to write a loop that removes elements with a length of less than 11:

for (i in 1:length(ips.info)){
if (length(ips.info[i]) < 11){
ips.info[i] <- NULL}}

This leaves some records with no data and others showing "NULL", but even the records showing "NULL" are not detected by is.null.
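
One likely contributor to this behaviour is the single-bracket index. `ips.info[i]` returns a sub-list that holds one element, so its length is 1 no matter how many fields the element has, and the `< 11` test is true every time. The sketch below shows the difference on a hypothetical stand-in list named `mock` (not real output of ip2coordinates):

```r
# Hypothetical stand-in for ips.info: two complete records and one failed lookup.
mock <- list(
  a = as.list(1:11),
  b = list("not found"),
  c = as.list(1:11)
)

len_single <- length(mock[1])    # single brackets: a sub-list holding one element
len_double <- length(mock[[1]])  # double brackets: the element itself

# lengths() (base R >= 3.2.0) reports the length of every element at once.
all_lengths <- lengths(mock)
```

Here `len_single` is 1 and `len_double` is 11, which is why a length test inside the loop needs `[[i]]`, or a vectorised call such as `lengths()`.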

Next, I tried the same thing with double square brackets and got:
```
Error in ips.info[[i]] : subscript out of bounds
```
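
This error is consistent with deleting elements while looping over indices that were computed before the loop started: `1:length(ips.info)` is evaluated once, but every deletion shortens the list. A minimal sketch on a hypothetical list named `mock`:

```r
# Hypothetical stand-in for ips.info with two incomplete records.
mock <- list(a = as.list(1:11), b = list("not found"), c = list("not found"))

idx <- 1:length(mock)   # fixed at 1, 2, 3 before any deletion happens
mock[[2]] <- NULL       # deleting an element shortens the list to length 2
new_len <- length(mock)

# Index 3 no longer exists, so [[3]] raises an error; capture its message.
res <- tryCatch(mock[[3]], error = function(e) conditionMessage(e))
```

Deciding first which elements to keep and then subsetting the list once avoids the moving-index problem entirely.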

I also tried complete.cases() to see if it could potentially be useful:

Error in complete.cases(ips.info) : not all arguments have the same length

Finally, I tried a variation of my for loop that was conditioned on length(ips.info[[i]] == 11 and wrote complete records to another object, but somehow it produced an exact copy of ips.info.

- Hack-R

2 Answers

7

An alternative solution based on the base package.

  # find non-complete elements
  ids.to.remove <- sapply(ips.info, function(i) length(i) < 11)
  # remove found elements
  ips.info <- ips.info[!ids.to.remove]
  # create data.frame
  df <- do.call(rbind, ips.info)
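
To try the masking step without calling the lookup service, here is a small sketch on made-up data. The list `mock`, its field names, and `n_fields` (3 instead of 11, to keep it short) are all assumptions for illustration. It uses `lengths()`, available in base R since 3.2.0, in place of the `sapply` call:

```r
# Hypothetical stand-in for ips.info with 3 fields per complete record.
mock <- list(
  "10.0.0.1" = list(ip.address = "10.0.0.1", country = "US", latitude = 40.1),
  "10.0.0.2" = list(ip.address = "10.0.0.2"),   # incomplete: lookup failed
  "10.0.0.3" = list(ip.address = "10.0.0.3", country = "DE", latitude = 52.5)
)
n_fields <- 3

# Logical mask of incomplete elements, computed in one vectorised call.
ids.to.remove <- lengths(mock) < n_fields

# Subset once; names are carried over from the original list.
complete <- mock[!ids.to.remove]
kept_names <- names(complete)
```

Because the mask is built before anything is removed, no index shifts while the list is being filtered.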

- DrDom

+1 for using built-in functions. I was already running RDSTK, so I haven't gone back to verify your solution yet, but it looks good. - Hack-R

I think `Filter` is a more elegant way to solve your task. - DrDom

The only thing from RDSTK is the ip2coordinates function. Everything else I used is base R. - MrFlick

- MrFlick · Accepted Answer

Here is a way to do this using the built-in Filter function:

#input data
library(RDSTK)
ip.addresses<-c("128.177.90.10","71.179.13.143","66.31.55.111","98.204.243.188",
    "67.231.207.8","67.61.248.15")
ips.info  <- sapply(ip.addresses, ip2coordinates)

#data.frame creation
lengthIs <- function(n) function(x) length(x)==n
do.call(rbind, Filter(lengthIs(11), ips.info))

Or, if you don't want to use the helper function:

do.call(rbind, Filter(function(x) length(x)==11, ips.info))
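
One caveat, assuming each element of the list is a plain list rather than a data frame (the thread does not say which): `rbind` on plain lists yields a list-mode matrix, not a data frame. A sketch on made-up data (the list `mock` and its 3 fields are hypothetical) that converts each record to a one-row data frame before stacking:

```r
# Hypothetical stand-in for ips.info with 3 fields per complete record.
mock <- list(
  "10.0.0.1" = list(ip.address = "10.0.0.1", country = "US", latitude = 40.1),
  "10.0.0.2" = list(ip.address = "10.0.0.2"),   # incomplete: lookup failed
  "10.0.0.3" = list(ip.address = "10.0.0.3", country = "DE", latitude = 52.5)
)

lengthIs <- function(n) function(x) length(x) == n
complete <- Filter(lengthIs(3), mock)

# rbind on plain lists gives a matrix whose cells are list elements.
m <- do.call(rbind, complete)

# Turn each record into a one-row data frame first, then stack the rows.
df <- do.call(rbind, lapply(complete, as.data.frame, stringsAsFactors = FALSE))
```

If the elements are already one-row data frames, the `lapply` step is harmless and the plain `do.call(rbind, ...)` from the answer already returns a data frame.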