How to sort a data frame in a user-defined (e.g. non-alphabetical) order

6

I have a data frame, dna:

> dna
chrom   start
chr2    39482
chr1    203918
chr1    198282
chrX    7839028
chr17   3874

The following code reorders dna by $chrom (ascending alphabetical) and then by $start (ascending numeric):
> dna <- dna[with(dna, order(chrom, start)), ]
> dna
chrom   start
chr1    198282
chr1    203918
chr17   3874
chr2    39482
chrX    7839028

However, I would like to sort $chrom in the following order instead (the example is simplified for convenience):

chrom_order <- c("chr1","chr2", "chr17", "chrX")

I cannot rename anything, e.g. rename chr1 to chr01.

2 Answers

10
You need to specify the levels of the factor in the order you want, and then index the data frame with order():
zz <- "chrom   start
chr2    39482
chr1    203918
chr1    198282
chrX    7839028
chr17   3874"
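Data <- read.table(text=zz, header = TRUE, stringsAsFactors = TRUE)

library(Hmisc)
library(gdata)

# reorder.factor() here is the gdata version; new.order gives the level order explicitly
Data$chrom  <- reorder.factor(Data$chrom, new.order = c("chr1","chr2", "chr17", "chrX"))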

Data[order(Data$chrom), ]
  chrom   start
2  chr1  203918
3  chr1  198282
1  chr2   39482
5 chr17    3874
4  chrX 7839028  

Alternatively, you can use this:

> Data$chrom  <- factor(Data$chrom, levels = c("chr1","chr2", "chr17", "chrX"))
> Data[order(Data$chrom), ]
  chrom   start
2  chr1  203918
3  chr1  198282
1  chr2   39482
5 chr17    3874
4  chrX 7839028

Or this, with gdata loaded:

> Data$chrom <- reorder(Data$chrom, new.order=c("chr1","chr2", "chr17", "chrX"))
> Data[order(Data$chrom), ]
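
If the chrom column itself should stay untouched, the same ordering can be computed on the fly. The following is a minimal base R sketch, assuming the data frame is called dna and the desired order is in chrom_order, as in the question. The factor exists only inside order(), and start is used as a tie-breaker within each chromosome (the variable names idx and sorted are just for illustration):

```r
# Rebuild the question's data with chrom as a plain character column.
dna <- data.frame(
  chrom = c("chr2", "chr1", "chr1", "chrX", "chr17"),
  start = c(39482L, 203918L, 198282L, 7839028L, 3874L),
  stringsAsFactors = FALSE
)
chrom_order <- c("chr1", "chr2", "chr17", "chrX")

# The factor is created only as a sort key; dna$chrom is not modified.
idx <- order(factor(dna$chrom, levels = chrom_order), dna$start)
sorted <- dna[idx, ]
sorted
```

Because nothing is assigned back to dna$chrom, the column keeps its original type and values, which may matter if other code depends on it.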

3

Try this:

dna <- structure(list(chrom = structure(c(2L, 1L, 1L, 4L, 3L), .Label = c("chr1", 
"chr2", "chr17", "chrX"), class = c("ordered", "factor")), start = c(39482L, 
203918L, 198282L, 7839028L, 3874L)), .Names = c("chrom", "start"
), row.names = c(NA, -5L), class = "data.frame")

chrom_order <- c("chr1","chr2", "chr17", "chrX")

# Make chrom column ordered. Second term defines the order
dna$chrom <- ordered(dna$chrom, chrom_order)
dna[with(dna, order(chrom, start)),]

 chrom   start
3  chr1  198282
2  chr1  203918
1  chr2   39482
5 chr17    3874
4  chrX 7839028
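
A variant that avoids factors altogether is to rank each value by its position in chrom_order using match(). This is only a sketch: dna2 and the extra value "chrM" are hypothetical, added to show what happens to a value that is missing from chrom_order. match() returns NA for it, and order() places NA keys last by default.

```r
chrom_order <- c("chr1", "chr2", "chr17", "chrX")

# Hypothetical data: the question's rows plus one chromosome ("chrM")
# that does not appear in chrom_order.
dna2 <- data.frame(
  chrom = c("chr2", "chrM", "chr1", "chrX", "chr17", "chr1"),
  start = c(39482L, 500L, 203918L, 7839028L, 3874L, 198282L),
  stringsAsFactors = FALSE
)

# Position of each chrom in the desired order (NA if not listed).
rank_in_order <- match(dna2$chrom, chrom_order)

# Sort by that position, then by start; rows with an NA rank go last.
sorted2 <- dna2[order(rank_in_order, dna2$start), ]
sorted2
```

With a factor, an unlisted value would also become NA, so both approaches need chrom_order to cover every value that should be ranked.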
