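I'm new to regular expressions in R. I have a vector of strings, and I want to extract the first number that occurs in each string.
The vector is called "shootsummary" and looks like this: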
> head(shootsummary)
[1] Aaron Alexis, 34, a military veteran and contractor from Texas, opened fire in the Navy installation, killing 12 people and wounding 8 before being shot dead by police.
[2] Pedro Vargas, 42, set fire to his apartment, killed six people in the complex, and held another two hostages at gunpoint before a SWAT team stormed the building and fatally shot him.
[3] John Zawahri, 23, armed with a homemade assault rifle and high-capacity magazines, killed his brother and father at home and then headed to Santa Monica College, where he was eventually killed by police.
[4] Dennis Clark III, 27, shot and killed his girlfriend in their shared apartment, and then shot two witnesses in the building's parking lot and a third victim in another apartment, before being killed by police.
[5] Kurt Myers, 64, shot six people in neighboring towns, killing two in a barbershop and two at a car care business, before being killed by officers in a shootout after a nearly 19-hour standoff.
The first number in each string is the person's age. I want to extract that age without picking up any of the other numbers in the string.
I used:
as.numeric(gsub("\\D", "", shootsummary))
which returns:
[1] 34128 42 23 27 6419
I would like a result that contains only the age from each sentence, without the numbers that come after it:
[1] 34 42 23 27 64
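
The gsub() call above deletes every non-digit character, so all the digits in a string end up joined together (34, 12 and 8 become 34128). One possible approach, sketched here in base R, is to match only the first run of digits instead. The vector x below is a shortened stand-in for shootsummary, made up for illustration, not the real data:

```r
# Shortened stand-in for shootsummary (illustrative only)
x <- c(
  "Aaron Alexis, 34, a military veteran, killing 12 people and wounding 8.",
  "Pedro Vargas, 42, set fire to his apartment.",
  "John Zawahri, 23, armed with a homemade assault rifle.",
  "Dennis Clark III, 27, shot and killed his girlfriend.",
  "Kurt Myers, 64, shot six people after a nearly 19-hour standoff."
)

# regexpr() locates only the first match in each string;
# regmatches() then pulls out the matched text.
ages <- as.numeric(regmatches(x, regexpr("[0-9]+", x)))

# Alternative: sub() replaces only the first match, so capture the
# first run of digits and discard everything around it.
ages_sub <- as.numeric(sub("\\D*(\\d+).*", "\\1", x))

print(ages)
```

Note that regmatches() silently drops strings that contain no digits, so the result can be shorter than the input, while the sub() version returns such strings unchanged and as.numeric() then turns them into NA with a warning. Which behaviour is preferable depends on the data.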