Convert a matrix of integer strings into an array of integer counts

Question

I have a character matrix whose entries are comma-separated strings of integers:

> mat<-matrix(c(NA,"1",NA,"2,1","3","1,3,3"),nrow=2)
> mat
     [,1] [,2]  [,3]   
[1,] NA   NA    "3"    
[2,] "1"  "2,1" "1,3,3"

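I would like a numeric array as output, where the z index is the integer and each cell holds the number of times that integer occurs in the corresponding matrix entry: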

, , 1

     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]   1    1    1 

, , 2

     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]   NA   1    NA

, , 3

     [,1] [,2] [,3]
[1,]   NA   NA   1
[2,]   NA   NA   2

How can I accomplish this?

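To give a sense of the data size, the final array will have dimensions of roughly 20,000 x 2,000 x 200, with the matrix forming the first two dimensions (i.e. 20,000 x 2,000).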

- dlv
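
For the dimensions quoted in the question, a quick back-of-the-envelope estimate of the dense array's size may be useful before choosing an approach. This is only a sketch: it counts raw element storage and ignores intermediate copies and object overhead.

```r
# Dimensions quoted in the question
dims <- c(20000, 2000, 200)

# Number of cells; prod() returns a double, which matters here because
# the result exceeds .Machine$integer.max
n_cells <- prod(dims)

# R stores integers in 4 bytes and doubles in 8 bytes
gib_integer <- n_cells * 4 / 2^30
gib_double  <- n_cells * 8 / 2^30

round(c(integer = gib_integer, double = gib_double), 1)
```

At about 8e9 cells, the array needs roughly 30 GiB as integers and twice that as doubles, so keeping the counts as integers (or in some sparse form) is probably worth considering.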

1 Answer

- Roland · Accepted Answer

This uses a loop (via sapply), so it may not be the most efficient solution:

mat<-matrix(c(NA,"1",NA,"2,1","3","1,3,3"),nrow=2)

#split the strings
temp <- strsplit(mat, ",", fixed=TRUE)

#unique values
levels <- na.omit(unique(do.call(c, temp)))

#convert to factors and use table
temp <- t(sapply(temp, function(x) table(factor(x, levels=levels))))

#make it an array
array(temp, c(nrow(mat), ncol(mat), length(levels)))
# , , 1
# 
#      [,1] [,2] [,3]
# [1,]    0    0    0
# [2,]    1    1    1
# 
# , , 2
# 
#      [,1] [,2] [,3]
# [1,]    0    0    0
# [2,]    0    1    0
# 
# , , 3
# 
#      [,1] [,2] [,3]
# [1,]    0    0    1
# [2,]    0    0    2
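
The output above contains 0 where the question's desired output shows NA. If the NA layout is needed, one possible follow-up step (not part of the original answer) is to overwrite the zero counts. The sketch below repeats the steps above so that it runs on its own; `res` is a name introduced here for illustration.

```r
mat <- matrix(c(NA, "1", NA, "2,1", "3", "1,3,3"), nrow = 2)

# Same steps as in the answer above
temp <- strsplit(mat, ",", fixed = TRUE)
levels <- na.omit(unique(do.call(c, temp)))
temp <- t(sapply(temp, function(x) table(factor(x, levels = levels))))
res <- array(temp, c(nrow(mat), ncol(mat), length(levels)))

# Replace zero counts with NA to match the layout requested in the question
res[res == 0L] <- NA
res
```

Note that this treats "integer absent from a non-empty cell" and "cell was NA" identically, which is what the question's example output does.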

Edit:

This avoids applying table and factor inside a loop and should be faster:

temp <- strsplit(mat, ",", fixed=TRUE)

id <- rep(seq_along(temp), sapply(temp, length))
temp <- factor(do.call(c, temp))
array(t(table(temp, id)), c(nrow(mat), ncol(mat), length(levels(temp))))
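
One thing to watch with the factor-based version: `factor()` applied to character data sorts its levels as strings, so with around 200 distinct integers a level such as "10" would come before "2" and the z index would no longer follow numeric order. A minimal sketch of one way to handle this, assuming every entry parses cleanly as an integer, converts the values before building the factor. The input matrix and the names `vals`, `lev`, `f`, `counts` and `res` are hypothetical, chosen for illustration.

```r
# Hypothetical input containing a two-digit value
mat <- matrix(c(NA, "1", NA, "2,10", "3", "1,3,3"), nrow = 2)

temp <- strsplit(mat, ",", fixed = TRUE)
id <- rep(seq_along(temp), lengths(temp))

# Convert to integer before building the factor so levels sort numerically
vals <- as.integer(unlist(temp))
lev <- sort(unique(vals))  # sort() drops NA by default
f <- factor(vals, levels = lev)

# Give id one level per matrix cell so cells without values keep a column
counts <- table(f, factor(id, levels = seq_along(temp)))

# Label the third dimension with the integer each slice counts
res <- array(t(counts), c(nrow(mat), ncol(mat), length(lev)),
             dimnames = list(NULL, NULL, as.character(lev)))
res[, , "10"]
```

Naming the third dimension lets a slice be looked up by the integer it counts rather than by its position, which helps when the integers present are not exactly 1 to n.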