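With separate, we can pass a regex lookaround as the separator. In this case, it matches the - that comes immediately before 4 digits
or the - that comes immediately after 4 digits, so the hyphens inside the title and the author name are left alone.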
library(tidyr)
# split only at the hyphens adjacent to the 4-digit year
separate(df1, pub_author, into = c('website_title', 'year', 'author'),
         sep = "-(?=\\d{4})|(?<=\\d{4})-")
#         website_title year        author
#1       nfl-draft-geek 2018 justin-miller
#2                  cbs 2019   pete-prisco
#3            sb-nation 2020     dan-kadar
#4    football-fan-spot 2019 steven-lourie
#5             fanspeak 2018       william
#6 acme-packing-company 2020  shawn-wagner
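To see which hyphens the pattern actually picks, here is a small base R sketch that replaces every match with a | instead of splitting on it. The vector x is a hypothetical two-element sample copied from the data, and perl = TRUE is assumed because base R needs it for lookahead and lookbehind.

```r
# Hypothetical sample taken from the pub_author column
x <- c("nfl-draft-geek-2018-justin-miller", "cbs-2019-pete-prisco")

# Replace only the hyphens next to the 4-digit year; perl = TRUE enables lookarounds
marked <- gsub("-(?=\\d{4})|(?<=\\d{4})-", "|", x, perl = TRUE)
marked
# [1] "nfl-draft-geek|2018|justin-miller" "cbs|2019|pete-prisco"
```

Only the two hyphens around the year are replaced, which is why separate ends up with exactly three columns.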
df1 <- structure(list(pub_author = c("nfl-draft-geek-2018-justin-miller",
"cbs-2019-pete-prisco", "sb-nation-2020-dan-kadar",
"football-fan-spot-2019-steven-lourie",
"fanspeak-2018-william", "acme-packing-company-2020-shawn-wagner"
)), class = "data.frame", row.names = c(NA, -6L))
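If tidyr is not available, the same lookaround pattern should also work with base R strsplit. This is only a sketch, not part of the answer above: the vector x stands in for df1$pub_author, and the names parts and out are made up for illustration.

```r
# Hypothetical stand-in for df1$pub_author
x <- c("nfl-draft-geek-2018-justin-miller", "cbs-2019-pete-prisco",
       "fanspeak-2018-william")

# perl = TRUE is required for the lookahead/lookbehind syntax
parts <- strsplit(x, "-(?=\\d{4})|(?<=\\d{4})-", perl = TRUE)

# Each element holds three pieces: title, year, author
out <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
names(out) <- c("website_title", "year", "author")
out
```

Note that year stays a character column here; wrap it in as.integer if a number is needed.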
extract(df, pub_author, into = c('number', 'first name', 'last name', 'position', 'school'), "^(\\d+)(\\w+)\\s+([A-Z][a-z]+)([A-Z]{2})\\s+\\|\\s+(\\w+)") - akrun
Please share the data with dput, because others cannot copy from an image, and you can also avoid potential downvotes :=) - akrun