How do I extract the number that precedes a specific word?

Question

3

I have the sentence "I have 5 kg apples and 6 kg pears".

I only want to extract the weight of the apples.

So I used:

import re

sentence = "I have 5 kg apples and 6 kg pears"
number = re.findall(r'(\d+) kg apples', sentence)
print(number)

However, this only works for integers. What if the number I want to extract is 5.5?

- Kevin Guo

7 Answers

0

In a regular expression, ? marks the preceding element as optional.

re.findall(r'((\d+\.)?\d+)', sentence)

- mVChr

number = re.findall(r'((\d+\.)?\d+)', sentence) returns a list of tuples [('5', ''), ('6', '')] - Dudnikof
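
The tuples appear because the pattern contains two capturing groups, and re.findall returns one tuple per match in that case. A minimal sketch of the difference, assuming the sample sentence with 5.5 substituted in (the variable names are illustrative only): making the inner group non-capturing with (?:...) yields plain strings.

```python
import re

sentence = "I have 5.5 kg apples and 6 kg pears"

# Two capturing groups: findall returns one tuple per match.
with_tuples = re.findall(r'((\d+\.)?\d+)', sentence)

# Inner group made non-capturing: findall returns plain strings.
plain = re.findall(r'((?:\d+\.)?\d+)', sentence)

print(with_tuples)
print(plain)
```

Note that neither pattern is anchored to "kg apples", so both also pick up the weight of the pears.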

0

You can use number = re.findall(r'(\d+\.?\d*) kg apples', sentence)
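
One detail worth checking is how this pattern treats a number with a trailing dot. The sketch below compares it with the stricter (\d+(?:\.\d+)?) form that appears elsewhere in this thread; the test strings and the names loose, strict and results are made up for illustration.

```python
import re

loose = r'(\d+\.?\d*) kg apples'        # digits, optional dot, optional digits
strict = r'(\d+(?:\.\d+)?) kg apples'   # a dot must be followed by digits

samples = ["I have 5.5 kg apples", "I have 5 kg apples", "I have 5. kg apples"]

# Map each sample to (loose result, strict result).
results = {s: (re.findall(loose, s), re.findall(strict, s)) for s in samples}

for text, (loose_hit, strict_hit) in results.items():
    print(text, loose_hit, strict_hit)
```

Both patterns agree on 5.5 and 5. They differ only on "5.", which the loose pattern accepts and the strict one rejects.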

- Dudnikof

0

You need to change your regex so that it matches decimals too:

(\d+(?:\.\d+)?)

\.\d+ matches a dot followed by at least one digit. I made that part optional so that plain integers still match.
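
To tie this back to the question, here is a small sketch that wraps the pattern in a helper and converts the matches to float. The function name number_before and its unit parameter are hypothetical, introduced only for this example. re.escape guards against regex metacharacters in the word, and the trailing \b keeps "apples" from matching inside a longer word.

```python
import re

def number_before(sentence, word, unit="kg"):
    """Return the numbers that directly precede '<unit> <word>' as floats."""
    pattern = (r'(\d+(?:\.\d+)?) ' + re.escape(unit)
               + r' ' + re.escape(word) + r'\b')
    return [float(match) for match in re.findall(pattern, sentence)]

sentence = "I have 5.5 kg apples and 6 kg pears"
apples = number_before(sentence, "apples")
pears = number_before(sentence, "pears")
print(apples, pears)
```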

- Maroun

0

re.findall(r'[-+]?[0-9]*\.?[0-9]+', sentence)

- Bobby

0

The regex you need should look like this:

(\d+\.?\d*) kg apples

You can use it as follows:

number = re.findall(r'(\d+\.?\d*) kg apples', sentence)

Here is an online demo

- Fabrizio

0

A non-regex solution

sentence = "I have 5.5 kg apples and 6 kg pears"
words  = sentence.split(" ")

[words[idx-1] for idx, word in enumerate(words) if word == "kg"]
# => ['5.5', '6']

You can then check whether each element is a valid float with:

try:
    float(element)
except ValueError:
    print("Not a float")
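
Putting the two steps together, a minimal sketch might look like the following. The function name numbers_before and the marker parameter are hypothetical, and tokens that fail the float conversion are skipped silently rather than reported.

```python
def numbers_before(sentence, marker="kg"):
    """Collect the token before each `marker`, keeping only valid floats."""
    words = sentence.split()
    values = []
    for idx, word in enumerate(words):
        # Skip non-markers, and a marker at position 0 (nothing precedes it).
        if word != marker or idx == 0:
            continue
        try:
            values.append(float(words[idx - 1]))
        except ValueError:
            pass  # preceding token is not a number, e.g. "six"
    return values

weights = numbers_before("I have 5.5 kg apples and six kg pears")
print(weights)
```

Keep in mind that float() also accepts strings such as "1e3", "inf" and "nan", so this check is more permissive than the regex patterns above.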

- tihom

Content provided by Stack Overflow.

- Mohammad Yusuf · Accepted Answer

You can try something like this:

import re

sentence = ["I have 5.5 kg apples and 6 kg pears",
            "I have 5 kg apples and 6 kg pears"]
for sen in sentence:
    print(re.findall(r'(\d+(?:\.\d+)?) kg apples', sen))

Output:

['5.5']
['5']