It looks like grep is "greedy" in the matches it returns. Suppose I have the following data:
Sources <- c(
"Coal burning plant",
"General plant",
"coalescent plantation",
"Charcoal burning plant"
)
Registry <- seq(from = 1100, to = 1103, by = 1)
df <- data.frame(Registry, Sources)
If I run
grep("(?=.*[Pp]lant)(?=.*[Cc]oal)", df$Sources, perl = TRUE, value = TRUE)
it returns:
"Coal burning plant"
"coalescent plantation"
"Charcoal burning plant"
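However, I only want exact matches, i.e. only entries where "coal" and "plant" themselves both appear. I don't want "coalescent", "plantation", and so on. So for this data I only want to see "Coal burning plant".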
Use == then, if you want an exact match: df$Sources[df$Sources == "Coal burning plant"] - thelatemail
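If "exact" here means that "coal" and "plant" must each appear as a whole word, rather than that the entire string must equal one known value, one possible approach is to add word boundaries (\b) inside each lookahead. This is only a sketch under that assumption; the names pattern and whole_word_hits are made up for illustration and do not come from the question.

```r
# Same data as in the question
Sources <- c(
  "Coal burning plant",
  "General plant",
  "coalescent plantation",
  "Charcoal burning plant"
)

# \\b marks a word boundary, so "coal" inside "Charcoal" or "coalescent"
# and "plant" inside "plantation" no longer satisfy the lookaheads.
pattern <- "(?=.*\\b[Pp]lant\\b)(?=.*\\b[Cc]oal\\b)"

# perl = TRUE is needed because lookaheads are a PCRE feature
whole_word_hits <- grep(pattern, Sources, perl = TRUE, value = TRUE)
print(whole_word_hits)
```

The == comparison is the simpler choice when the full target string is known in advance. The word-boundary pattern is for the case where only the two words are known and the rest of the string can vary.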