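Building on @agile bean's answer, I recently wrote a function (using rename_with, formerly rename_at) that renames columns only if they exist in the data frame, so that the column names of heterogeneous data frames can be made to match where applicable.
The loop can certainly be improved, but I wanted to share it for posterity.
Create an example data frame: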
x= structure(list(observation_date = structure(c(18526L, 18784L,
17601L), class = c("IDate", "Date")), year = c(2020L, 2021L,
2018L)), sf_column = "geometry", agr = structure(c(id = NA_integer_,
common_name = NA_integer_, scientific_name = NA_integer_, observation_count = NA_integer_,
country = NA_integer_, country_code = NA_integer_, state = NA_integer_,
state_code = NA_integer_, county = NA_integer_, county_code = NA_integer_,
observation_date = NA_integer_, time_observations_started = NA_integer_,
observer_id = NA_integer_, sampling_event_identifier = NA_integer_,
protocol_type = NA_integer_, protocol_code = NA_integer_, duration_minutes = NA_integer_,
effort_distance_km = NA_integer_, effort_area_ha = NA_integer_,
number_observers = NA_integer_, all_species_reported = NA_integer_,
group_identifier = NA_integer_, year = NA_integer_, checklist_id = NA_integer_,
yday = NA_integer_), class = "factor", .Label = c("constant",
"aggregate", "identity")), row.names = c("3", "3.1", "3.2"), class = "data.frame")
The function:
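library(dplyr)

match_col_names <- function(x) {
  # Each element: target name = vector of aliases to be renamed to it
  col_names <- list(
    date      = c("observation_date", "date"),
    C         = c("observation_count", "count", "routetotal"),
    yday      = c("dayofyear"),
    latitude  = c("lat"),
    longitude = c("lon", "long")
  )
  for (i in seq_along(col_names)) {
    newname   <- names(col_names)[i]
    oldnames  <- col_names[[i]]
    toreplace <- names(x)[names(x) %in% oldnames]
    # Skip targets with no matching column; assumes at most one alias per target is present in x
    if (length(toreplace) > 0) {
      x <- x %>%
        rename_with(~newname, all_of(toreplace))
    }
  }
  return(x)
}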
Apply the function:
x <- match_col_names(x)
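Since the loop is flagged as improvable, here is one possible loop-free sketch in base R. It is not the answer's method: the helper name rename_by_lookup and the aliases argument are hypothetical, introduced only for illustration. It flattens the alias list into a lookup vector (old name -> new name) and renames every matching column in one vectorised step. Like the loop version, it assumes at most one alias per target is present in the data frame.

```r
# Hypothetical helper (not from the original answer): vectorised renaming via a lookup
rename_by_lookup <- function(x, aliases) {
  # Build a named vector: names are the old column names, values are the new names
  lookup <- setNames(
    rep(names(aliases), lengths(aliases)),
    unlist(aliases, use.names = FALSE)
  )
  # Rename only the columns that actually appear in the lookup
  hit <- names(x) %in% names(lookup)
  names(x)[hit] <- unname(lookup[names(x)[hit]])
  x
}

# Same alias list as in the function above
aliases <- list(
  date      = c("observation_date", "date"),
  C         = c("observation_count", "count", "routetotal"),
  yday      = c("dayofyear"),
  latitude  = c("lat"),
  longitude = c("lon", "long")
)

# Small illustrative data frame (made-up values)
df <- data.frame(observation_date = 1:2, year = 3:4, lon = 5:6)
out <- rename_by_lookup(df, aliases)
names(out)
```

Passing the alias list as an argument, rather than hard-coding it inside the function, makes it easier to reuse the helper with a different set of target names.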
gsubfn
answer. Maybe G.Grothendieck will come by. He is the regex master. - IRTFM