I have a data.frame in which every gene name appears twice, once for each of 2 conditions:
df <- data.frame(gene=c("A","A","B","B","C","C"),
                 condition=c("control","treatment","control","treatment","control","treatment"),
                 count=c(10, 2, 5, 8, 5, 1),
                 sd=c(1, 0.2, 0.1, 2, 0.8, 0.1))

  gene condition count  sd
1    A   control    10 1.0
2    A treatment     2 0.2
3    B   control     5 0.1
4    B treatment     8 2.0
5    C   control     5 0.8
6    C treatment     1 0.1
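I want to work out whether "count" increased or decreased after treatment, then label each gene accordingly and/or subset on that label. That is (pseudocode, where geneRow1 is the gene's control row and geneRow2 is its treatment row):
for each unique(gene) do
    if df[geneRow2,3] - df[geneRow1,3] > 0 then gene is "up"
    else gene is "down"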
This is what the final result should look like (the last column is optional):
up-regulated
gene condition count  sd regulation
   B   control     5 0.1         up
   B treatment     8 2.0         up

down-regulated
gene condition count  sd regulation
   A   control    10 1.0       down
   A treatment     2 0.2       down
   C   control     5 0.8       down
   C treatment     1 0.1       down
I have been racking my brain over this, including attempts with ddply, but have not found a solution. Please help a helpless biologist.
Thanks.
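
For reference, the pseudocode above could be sketched in base R roughly as follows. This is only one possible approach, not a definitive answer. It assumes every gene has exactly one "control" row and one "treatment" row, and it treats an unchanged count as "down", as the pseudocode's else branch does. The names `paired`, `up_regulated` and `down_regulated` are illustrative and do not come from the original post.

```r
# Example data from the question
df <- data.frame(
  gene      = c("A", "A", "B", "B", "C", "C"),
  condition = c("control", "treatment", "control", "treatment", "control", "treatment"),
  count     = c(10, 2, 5, 8, 5, 1),
  sd        = c(1, 0.2, 0.1, 2, 0.8, 0.1)
)

# Assumption: each gene has exactly one "control" and one "treatment" row.
control   <- df[df$condition == "control",   c("gene", "count")]
treatment <- df[df$condition == "treatment", c("gene", "count")]

# Pair the two rows of each gene; the suffixes name the two count columns
# (count.control and count.treatment).
paired <- merge(control, treatment, by = "gene",
                suffixes = c(".control", ".treatment"))

# "up" if the count rose after treatment, otherwise "down"
# (a tie is labelled "down", following the pseudocode's else branch).
paired$regulation <- ifelse(paired$count.treatment - paired$count.control > 0,
                            "up", "down")

# Copy the per-gene label back onto every row of the original data frame.
df$regulation <- paired$regulation[match(df$gene, paired$gene)]

# Subset by label.
up_regulated   <- df[df$regulation == "up", ]
down_regulated <- df[df$regulation == "down", ]
```

Printing `up_regulated` and `down_regulated` should then show the two subsets with the extra `regulation` column. Splitting by condition and merging on `gene` avoids relying on the row order within each gene, which the row-index version of the pseudocode depends on.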