Modify an R factor?


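Suppose you have a data.frame object in R in which all character columns have been converted to factors. You now need to "modify" the value in one row of the data frame while keeping it encoded as a factor. First you need to extract a single row; here is a reproducible example: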

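a = c("ab", "ba", "ca")
b = c("ab", "dd", "da")
c = c("cd", "fa", "op")
# stringsAsFactors = TRUE is needed on R >= 4.0.0, where the default is FALSE
data = data.frame(a, b, c, row.names = c("row1", "row2", "row3"), stringsAsFactors = TRUE)
colnames(data) <- c("col1", "col2", "col3")
data[,"col1"] <- as.factor(data[,"col1"])  # no-op when col1 is already a factor
newdat <- data["row1",]   # extract a single row; col1 is still a factor here
newdat["col1"] <- "ca"    # replaces the whole column, so col1 becomes character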

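When I assign "ca" to newdat["col1"], the factor object associated with that column is overwritten by the string "ca". This is not the intended behavior. Instead, I want to modify the numeric value that encodes which level is present in newdat. So I would like to change the contents of newdat["col1"] as follows:

Before:

Factor object, levels = c("ab", "ba", "ca"): 1 (the value it had)

After:

Factor object, levels = c("ab", "ba", "ca"): 3 (the value associated with the level "ca")

How can I accomplish this?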

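You can call factor again to include the new level and then do the assignment. - akrun
It already contains the level I want to change to, but the statement above changes the field column in row 15 so that its data type is no longer a factor, even though new_val is part of the level set. - Quantumpencil
Have you tried dataframe[15,'field'] <- new_val ? (Untested, since there is no reproducible example.) - akrun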
Please add a reproducible example to your question. - Richard Erickson
Updated the original question and added more detail. - Quantumpencil
1 Answer

What you are doing is equivalent to:
x = factor(letters[1:4]) #factor
x1 = x[1] #factor; subset of 'x'
x1 = "c" #assign new value

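That is, you are assigning a new object to an existing symbol. In your example, you simply replace the factor in newdat["col1"] with "ca". To subassign into a factor instead (subassigning a value that is not a level results in NA), you can use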

x = factor(letters[1:4])
x1 = x[1]
x1[1] = "c"  #factor; subset of 'x' with the 3rd level
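
The parenthetical remark about non-levels, and the earlier comment about calling factor again to include a new level, can be illustrated with a minimal sketch. The names y and z and the value "z" are made up for this illustration and do not appear in the question; extending the level set with levels<- is one possible way to do it, not necessarily what the commenter had in mind.

```r
x <- factor(letters[1:4])        # levels: "a" "b" "c" "d"
x1 <- x[1]                       # still a factor carrying all four levels

# Subassigning a value that is not an existing level yields NA
# (R also emits a warning, suppressed here to keep the output clean).
y <- x1
suppressWarnings(y[1] <- "z")
is.na(y[1])                      # TRUE

# Extending the level set first makes the same subassignment valid.
z <- x1
levels(z) <- c(levels(z), "z")   # append "z" as a fifth level
z[1] <- "z"
as.character(z)                  # "z"
as.integer(z)                    # 5, the code of the newly added level
```

In both cases the object stays a factor; only the integer code (or NA) stored in it changes.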

In your example (I use local to avoid repeatedly changing newdat below):

str(newdat)
#'data.frame':   1 obs. of  3 variables:
# $ col1: Factor w/ 3 levels "ab","ba","ca": 1
# $ col2: Factor w/ 3 levels "ab","da","dd": 1
# $ col3: Factor w/ 3 levels "cd","fa","op": 1
local({ newdat["col1"] = "ca"; str(newdat) })
#'data.frame':   1 obs. of  3 variables:
# $ col1: chr "ca"
# $ col2: Factor w/ 3 levels "ab","da","dd": 1
# $ col3: Factor w/ 3 levels "cd","fa","op": 1    
local({ newdat[1, "col1"] = "ca"; str(newdat) })
#'data.frame':   1 obs. of  3 variables:
# $ col1: Factor w/ 3 levels "ab","ba","ca": 3
# $ col2: Factor w/ 3 levels "ab","da","dd": 1
# $ col3: Factor w/ 3 levels "cd","fa","op": 1
local({ newdat[["col1"]][1] = "ca"; str(newdat) })
#'data.frame':   1 obs. of  3 variables:
# $ col1: Factor w/ 3 levels "ab","ba","ca": 3
# $ col2: Factor w/ 3 levels "ab","da","dd": 1
# $ col3: Factor w/ 3 levels "cd","fa","op": 1
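
To confirm that the underlying integer code really moves from 1 to 3, as the question asks, something like the following self-contained check could be run. It rebuilds the example data with stringsAsFactors = TRUE, which is an assumption needed on R >= 4.0.0 so that the character columns become factors as described in the question.

```r
# Rebuild the example data; stringsAsFactors = TRUE is assumed so that
# all character columns become factors, as described in the question.
data <- data.frame(
  col1 = c("ab", "ba", "ca"),
  col2 = c("ab", "dd", "da"),
  col3 = c("cd", "fa", "op"),
  row.names = c("row1", "row2", "row3"),
  stringsAsFactors = TRUE
)

newdat <- data["row1", ]
as.integer(newdat$col1)      # 1, the code for "ab"

# Address the single cell (row and column), not the whole column.
newdat[1, "col1"] <- "ca"

is.factor(newdat$col1)       # TRUE
as.integer(newdat$col1)      # 3, the code for "ca"
levels(newdat$col1)          # "ab" "ba" "ca"
```

Indexing with both a row and a column is what routes the assignment through the factor's own subassignment method, so the level set is kept.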
