Looping through data frame column names - R

17
I am trying to loop through the column names of a data frame and evaluate the class of each column.
for (i in columns(df)){
  class(df$i)
}

I have tried everything except the right way...

By the way: I am doing this because afterwards I need to set different conditions for each class.


5
Applying sapply() to the data frame df runs class() on each variable and returns a vector containing the class of each variable. - Sathish
1
for (i in 1:length(df)){ class(df[,i]) } - Jorge
I don't know what you want to do afterwards, but are you familiar with the dplyr::mutate_if and dplyr::summarise_if family of functions? - hpesoj626
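The sapply() suggestion in the comments above could look roughly like this. The question does not show its data, so the built-in iris data set stands in for df here (an assumption).

```r
# Stand-in data: the question's df is not shown, so iris is an assumption.
df <- iris

# sapply() calls class() on every column and simplifies the result.
col_classes <- sapply(df, class)
print(col_classes)

# Names of the numeric columns, e.g. to apply a class-specific rule later.
numeric_cols <- names(df)[sapply(df, is.numeric)]
print(numeric_cols)
```

Note that if some column has more than one class (an ordered factor or a POSIXct date-time, for example), sapply() cannot simplify to a character vector and returns a list instead.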
2 Answers

40

To answer the exact question and fix the given code, see the example below.

df <- iris # data

for (i in colnames(df)){
   print(class(df[[i]]))
}
# [1] "numeric"
# [1] "numeric"
# [1] "numeric"
# [1] "numeric"
# [1] "factor"
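  1. You need to use the colnames function to get the column names of df.
  2. To find the class of each column, access the column with df[[i]]. In contrast, df[i] is of class data.frame.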
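To illustrate the two points above, here is a small sketch (again with iris standing in for the data) that compares the three ways of indexing and then branches on the class, which the question names as the end goal. The per-class summaries are purely illustrative assumptions, not something the question specifies.

```r
df <- iris
i <- "Species"

class(df[[i]])   # the column itself
class(df[i])     # a one-column data.frame
is.null(df$i)    # TRUE for iris: `$` looks up a column named "i", not the value of i

# Illustrative per-class handling; the summaries chosen here are assumptions.
for (i in colnames(df)) {
  col <- df[[i]]
  if (is.numeric(col)) {
    cat(i, ": numeric, mean =", mean(col), "\n")
  } else if (is.factor(col)) {
    cat(i, ": factor with", nlevels(col), "levels\n")
  } else {
    cat(i, ": other class", class(col)[1], "\n")
  }
}
```

This is why the original `class(df$i)` never reports the real column class: `$` does not evaluate the loop variable, so the lookup must go through `[[`.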

2
Is it possible to start the loop from a different column (for example, column 11) instead of the first one? - viridius

1
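The question was how to loop through the columns of a data frame, and a follow-up question asked how to loop through some subset of the columns. I used the mtcars data set because it has more columns than iris, which makes for a richer example. To loop over a subset of columns, use numeric indices in the for loop rather than column names. If the columns of interest do not form a contiguous range (for example, they are irregularly spaced), build a vector of their indices and loop over that. Examples follow: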
#Similar to previous answer only with mtcars rather than iris data.
df2<-mtcars
for (i in colnames(df2)){print(paste(i,"  ",class(df2[[i]])))}

#An alternative that is as simple but does not also print the variable names.
df2<-mtcars
for (i in 1:ncol(df2)){print(paste(i,"  ",class(df2[[i]])))}

#With variable names:
df2<-mtcars
for (i in 1:ncol(df2)){print(paste(i,"   ",colnames(df2[i]),"  ",class(df2[[i]])))}

#Now that we are looping numerically one can start in column 3 by:
df2<-mtcars
for (i in 3:ncol(df2)){print(paste(i,"   ",colnames(df2[i]),"  ",class(df2[[i]])))}

#To stop before the last column add a break statement inside an if
df2<-mtcars
for (i in 3:ncol(df2)){
  if(i>7){break}
  print(paste(i,"   ",colnames(df2[i]),"  ",class(df2[[i]])))}

#Finally, if you know the columns and they are irregularly spaced try this:
UseCols<-c(2,4,7,9,10)
for (i in UseCols){print(paste(i,"   ",colnames(df2[i]),"  ",class(df2[[i]])))}
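As a sketch of some alternatives to the loops above: the break-based version can be written by looping directly over the desired range, and names(df2)[i] reads a column name without building a one-column data frame first. Selecting by name works the same way. The names in use_cols are real mtcars columns picked arbitrarily for illustration.

```r
df2 <- mtcars

# Columns 3 through 7, no break needed.
for (i in 3:7) {
  print(paste(i, names(df2)[i], class(df2[[i]])))
}

# Selecting by name instead of by position.
use_cols <- c("cyl", "hp", "qsec")
for (nm in use_cols) {
  print(paste(nm, class(df2[[nm]])))
}

# seq_along() is safer than 1:ncol() if a data frame could have zero columns,
# because 1:0 counts down (1, 0) instead of producing an empty sequence.
for (i in seq_along(df2)) {
  stopifnot(is.numeric(df2[[i]]))  # every mtcars column is numeric
}
```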

Please provide some explanation of this code. - Christopher Moore
