Python Pandas: Extract numbers from text into a new column

Question

3

I have the following text in column A:

A   
hellothere_3.43  
hellothere_3.9

I want to extract only the numbers into a new column B (next to A), for example:

B                      
3.43   
3.9

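I am using: str.extract('(\d.\d\d)', expand=True), but this only copies 3.43 (that is, numbers with exactly that digit layout). Is there a way to make it more general?

Many thanks!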

- MGs

2 Answers

0

I think splitting the string and applying a lambda is quite clean.

import pandas as pd

df = pd.DataFrame({"A": ["hellothere_3.43", "hellothere_3.9"]})
df["B"] = df['A'].str.split('_').apply(lambda x: float(x[1]))

I have not done any formal comparison, but in small tests it seemed faster than the regex solution.
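
One informal way to check the speed claim is a timeit sketch like the one below. The 2,000-row frame and the helper names via_split and via_regex are assumptions made for illustration and are not part of the original answer; timings will vary by machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical test data: the two sample strings repeated 1,000 times.
df = pd.DataFrame({"A": ["hellothere_3.43", "hellothere_3.9"] * 1000})


def via_split(frame):
    # Split on "_" and convert the second piece to float.
    return frame["A"].str.split("_").apply(lambda parts: float(parts[1]))


def via_regex(frame):
    # Extract the first number-like substring, then convert to float.
    return frame["A"].str.extract(r"(\d*\.?\d+)", expand=False).astype(float)


split_time = timeit.timeit(lambda: via_split(df), number=5)
regex_time = timeit.timeit(lambda: via_regex(df), number=5)
print(f"split: {split_time:.4f}s  regex: {regex_time:.4f}s")
```

Note that the split approach assumes every value contains exactly one underscore followed by a number; the regex approach does not depend on the separator.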

- RickardSjogren

- Rakesh · Accepted Answer

Use a regular expression.

Example:

import pandas as pd

df = pd.DataFrame({"A": ["hellothere_3.43", "hellothere_3.9"]})
df["B"] = df["A"].str.extract(r"(\d*\.?\d+)", expand=True)
print(df)

Output:

                 A     B
0  hellothere_3.43  3.43
1   hellothere_3.9   3.9
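
The extracted values in B are strings, not numbers. If numeric values are needed, a minimal sketch along these lines may help. It assumes pd.to_numeric with errors="coerce" is acceptable for your data, and the third row ("no_number_here") is a hypothetical value added only to show what happens when nothing matches.

```python
import pandas as pd

# The third row is a hypothetical example with no digits in it.
df = pd.DataFrame({"A": ["hellothere_3.43", "hellothere_3.9", "no_number_here"]})

# expand=False returns a Series when the pattern has a single capture group.
extracted = df["A"].str.extract(r"(\d*\.?\d+)", expand=False)

# Rows without a match are NaN; the remaining strings are converted to floats.
df["B"] = pd.to_numeric(extracted, errors="coerce")
print(df.dtypes)
```

The pattern matches integers as well as decimals, but it does not capture a leading minus sign or scientific notation; those would need a wider pattern.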