I have a matrix of this format:
set.seed(1)
mat <- matrix(round(runif(25,0,1)),nrow=5,ncol=5)
colnames(mat) <- c("a1::C","a1::A","a1::B","b1::D","b1::A")
     a1::C a1::A a1::B b1::D b1::A
[1,]     0     1     0     0     1
[2,]     0     1     0     1     0
[3,]     1     1     1     1     1
[4,]     1     1     0     0     0
[5,]     0     0     1     1     0
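Each column represents a subject and a feature (given by the column name, with the two parts separated by a double colon). In each row, a value of 1 means the subject has that feature and 0 means it does not. A subject may have 0 in all of its columns for a particular row.
I want to build a new matrix in which the columns are the subjects (one column per subject) and each entry lists the features that subject has in that row, sorted alphabetically and separated by commas. If a subject has no features in a row (i.e., all of its columns are 0 in that row), the entry should be "W".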
Here is what the new matrix would look like for mat:
cnames = unique(sapply(colnames(mat), function(x) strsplit(x,split="::")[[1]][1]))
new_mat <- matrix(c("A","A","A,B,C","A,C","B",
"A","D","A,D","W","D"),
nrow=nrow(mat),ncol=length(cnames))
colnames(new_mat) = cnames
     a1      b1
[1,] "A"     "A"
[2,] "A"     "D"
[3,] "A,B,C" "A,D"
[4,] "A,C"   "W"
[5,] "B"     "D"
Is there an efficient and elegant way to achieve this?
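To make the intended transformation concrete, here is a minimal base R sketch of it. It is only a straightforward baseline, not necessarily the efficient or elegant solution being asked for. It assumes every column name has the form subject::feature with exactly one "::" separator; the names parts, subjects, features, and collapse_subject are hypothetical helpers introduced for illustration.

```r
# Rebuild the example matrix from the question.
set.seed(1)
mat <- matrix(round(runif(25, 0, 1)), nrow = 5, ncol = 5)
colnames(mat) <- c("a1::C", "a1::A", "a1::B", "b1::D", "b1::A")

# Split each column name into its subject and feature parts
# (assumes exactly one "::" per column name).
parts    <- strsplit(colnames(mat), "::", fixed = TRUE)
subjects <- vapply(parts, `[`, character(1), 1)
features <- vapply(parts, `[`, character(1), 2)

# For one subject, build a character vector with one entry per row:
# the sorted, comma-separated features flagged 1, or "W" if none.
collapse_subject <- function(s) {
  idx <- which(subjects == s)
  apply(mat[, idx, drop = FALSE], 1, function(row) {
    present <- sort(features[idx][row == 1])
    if (length(present) == 0) "W" else paste(present, collapse = ",")
  })
}

# One column per subject; vapply names the columns after cnames.
cnames  <- unique(subjects)
new_mat <- vapply(cnames, collapse_subject, character(nrow(mat)))
new_mat
```

On the example above this should reproduce the hand-built new_mat. The row-wise apply call makes it one function call per row per subject, so it may be slow on large matrices.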