How do I check whether a string represents a number (float or int)?

Question

python casting floating-point type-conversion

1961

How can I check in Python whether a string represents a numeric value?

def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

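The above works, but it seems clunky.

If what you are testing comes from user input, it is still a string even if it represents an int or a float. See "How can I read inputs as numbers?" for converting the input, and for making the user enter a valid response that represents an int or float (or meets other requirements) before continuing.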

- Daniel Goldberg

98

What's wrong with your current solution? It's short, fast and readable. - Colonel Panic

5

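You don't have to return only True or False; you can return a suitably modified value instead. For example, you could use this to put quotes around non-numbers. - Thruston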

8

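Wouldn't it be better to return the result of float(s) on a successful conversion? You still have the check for success (the result is False), and you already have the conversion, which you are likely to want anyway. - Jiminion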

10

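Even though this question is old, I just want to say that this approach is known as EAFP and is an elegant one. So it is probably the best solution for this kind of problem. - thiruvenkadam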

9

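Don't return the result of float(s), or None on failure. If you then use it as x = float('0.00'); if x: use_float(x); you now have a bug in your code. Truthiness is the reason these functions raise an exception rather than returning None in the first place. A better solution is to skip the utility function and wrap the call to float in a try/except block at the point where you want to use it. - ovangle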
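
The pitfall described in the comment above can be sketched as follows. The helper name try_float is hypothetical and only serves the illustration; the point is that a valid 0.0 is falsy, so a truthiness test cannot distinguish it from a failed conversion.

```python
def try_float(s):
    """Hypothetical helper: return the parsed float, or None on failure."""
    try:
        return float(s)
    except ValueError:
        return None

x = try_float("0.00")

# A truthiness test treats the valid value 0.0 like a failure.
used_by_truthiness = bool(x)

# An explicit None test keeps the two cases apart.
used_by_none_check = x is not None

print(used_by_truthiness, used_by_none_check)
```

If a sentinel return value is used at all, comparing against None explicitly avoids the bug; catching the exception at the call site avoids the sentinel entirely.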


41 Answers

- Aruthawolf · Answer 1

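I know this is particularly old, but I would like to add an answer that I believe covers information missing from the highest-voted answer, which could be very valuable to anyone who finds this question:

For each of the following methods, combine them with a count if you need any input to be accepted. (This assumes we are using verbal definitions of integers rather than ranges such as 0-255.)

x.isdigit() works well for checking whether x is an integer.

x.replace('-','').isdigit() works well for checking whether x is a negative number. (Check for - in the first position.)

x.replace('.','').isdigit() works well for checking whether x is a decimal.

x.replace(':','').isdigit() works well for checking whether x is a ratio.

x.replace('/','',1).isdigit() works well for checking whether x is a fraction.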
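
One possible reading of "combine them with a count" is to limit how many separator characters may be stripped and to accept the sign only in the first position. The sketch below follows that reading; the function name is_signed_decimal is hypothetical and not part of the answer.

```python
def is_signed_decimal(x):
    """Hypothetical helper: optional leading '-', digits, at most one '.'."""
    # Accept a minus sign only in the first position.
    body = x[1:] if x.startswith("-") else x
    # Allow at most one decimal point, then require digits only.
    return body.count(".") <= 1 and body.replace(".", "", 1).isdigit()

# The unrestricted replace() accepts a minus sign anywhere in the string.
naive = "1-2".replace("-", "").isdigit()

for sample in ["12", "-12.5", "1.2.3", "--5", "1-2", ""]:
    print(repr(sample), is_signed_decimal(sample))
```

Without the position and count checks, a string such as "1-2" passes the replace-based test even though it is not a number.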

- Evan Plaice · Answer 2

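Just mimic C#

In C# there are two different functions that handle parsing of scalar values:

Float.Parse()
Float.TryParse()

float.parse():

def parse(string):
    try:
        return float(string)
    except Exception:
        # Report any conversion failure as a TypeError
        raise TypeError

Note: If you're wondering why I changed the exception to a TypeError, see the documentation.

float.try_parse():

def try_parse(string, fail=None):
    try:
        return float(string)
    except Exception:
        return fail

Note: You don't want to return the boolean 'False', because that is still a value type. None is better because it indicates failure. Of course, if you want something different, you can change the 'fail' parameter to whatever you like.

To extend float to include 'parse()' and 'try_parse()', you need to monkey-patch the 'float' class to add these methods.

If you want to respect pre-existing functions, the code should look something like this:

def monkey_patch():
    # Note: CPython rejects attribute assignment on built-in types such as
    # float (TypeError), so in practice these helpers have to stay
    # module-level functions or live on a float subclass.
    if not hasattr(float, 'parse'):
        float.parse = parse
    if not hasattr(float, 'try_parse'):
        float.try_parse = try_parse

Side note: I personally prefer to call it "monkey punching", because doing this feels like abusing the language, but YMMV.

Usage:

float.parse('giggity')    # raises TypeError
float.parse('54.3')       # returns the scalar value 54.3
float.try_parse('twank')  # returns None
float.try_parse('32.2')   # returns the scalar value 32.2

The great Sage Pythonas said to the Holy See Sharpisus, "Anything you can do I can do better; I can do anything better than you."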
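
Since CPython does not allow new attributes on the built-in float type, one way to get the same call style is a small subclass. This is only a sketch under that assumption: the class name Float is hypothetical, and it is a swapped-in technique rather than the monkey patch the answer describes.

```python
class Float(float):
    """Hypothetical float subclass carrying the two parsing helpers."""

    @classmethod
    def parse(cls, string):
        # Mirror the answer: report a failed conversion as TypeError.
        try:
            return cls(string)
        except ValueError:
            raise TypeError("not a float: %r" % (string,)) from None

    @classmethod
    def try_parse(cls, string, fail=None):
        # Return the fallback value instead of raising.
        try:
            return cls(string)
        except (ValueError, TypeError):
            return fail

# Assigning to the built-in type itself fails on CPython.
try:
    float.parse = Float.parse
    patched = True
except TypeError:
    patched = False

print(Float.parse("54.3"), Float.try_parse("twank"), patched)
```

The subclass keeps the built-in type untouched, so other code that relies on float is not affected.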

- codelogic · Answer 3

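Casting to float and catching ValueError is probably the fastest way, since float() is specifically meant for just that. Anything else that requires string parsing (regular expressions and so on) will likely be slower, because it is not tuned for this operation. My $0.02.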

- Blackzafiro · Answer 4

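You can use Unicode strings; they have a method that does just what you want (the u prefix and unicode() below are Python 2 idioms; in Python 3 every str is already Unicode):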

>>> s = u"345"
>>> s.isnumeric()
True

Or:

>>> s = "345"
>>> u = unicode(s)
>>> u.isnumeric()
True

http://www.tutorialspoint.com/python/string_isnumeric.htm

http://docs.python.org/2/howto/unicode.html

- user10461621 · Answer 5

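The input may be any of the following:

a="50"
b=50
c=50.1
d="50.1"

1- General input:

The input of this function can be anything!

It determines whether the given variable is numeric. Numeric strings consist of an optional sign, any number of digits, an optional decimal part and an optional exponential part. Thus +0123.45e6 is a valid numeric value. Hexadecimal (e.g. 0xf4c3b00c) and binary (e.g. 0b10100111001) notation is described as not allowed, although the ast-based function below does accept such literals, as the "0x45" test shows.

is_numeric function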

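import ast
import numbers

def is_numeric(obj):
    if isinstance(obj, numbers.Number):
        return True
    elif isinstance(obj, str):
        try:
            # Drop the Module node; ast.walk yields the remaining nodes breadth-first
            nodes = list(ast.walk(ast.parse(obj)))[1:]
        except (SyntaxError, ValueError):
            # Not valid Python source, so not a numeric literal
            return False
        if not nodes or not isinstance(nodes[0], ast.Expr):
            return False
        # ast.Num was removed in Python 3.12; on 3.8+ numeric literals are ast.Constant
        last = nodes[-1]
        if not isinstance(last, ast.Constant) or type(last.value) not in (int, float, complex):
            return False
        nodes = nodes[1:-1]
        for i in range(len(nodes)):
            # Leading + or - signs show up as alternating UnaryOp and USub/UAdd nodes
            if i % 2 == 0:
                if not isinstance(nodes[i], ast.UnaryOp):
                    return False
            else:
                if not isinstance(nodes[i], (ast.USub, ast.UAdd)):
                    return False
        return True
    else:
        return False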

Testing:

>>> is_numeric("54")
True
>>> is_numeric("54.545")
True
>>> is_numeric("0x45")
True

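is_float function

It determines whether the given variable is a float. Float strings consist of an optional sign, any number of digits, a decimal point and a fractional part.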

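import ast

def is_float(obj):
    if isinstance(obj, float):
        return True
    if isinstance(obj, int):
        return False
    elif isinstance(obj, str):
        try:
            # Drop the Module node; ast.walk yields the remaining nodes breadth-first
            nodes = list(ast.walk(ast.parse(obj)))[1:]
        except (SyntaxError, ValueError):
            # Not valid Python source, so not a float literal
            return False
        if not nodes or not isinstance(nodes[0], ast.Expr):
            return False
        # ast.Num was removed in Python 3.12; on 3.8+ numeric literals are ast.Constant
        last = nodes[-1]
        if not isinstance(last, ast.Constant) or not isinstance(last.value, float):
            return False
        nodes = nodes[1:-1]
        for i in range(len(nodes)):
            # Leading + or - signs show up as alternating UnaryOp and USub/UAdd nodes
            if i % 2 == 0:
                if not isinstance(nodes[i], ast.UnaryOp):
                    return False
            else:
                if not isinstance(nodes[i], (ast.USub, ast.UAdd)):
                    return False
        return True
    else:
        return False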

Testing:

>>> is_float("5.4")
True
>>> is_float("5")
False
>>> is_float(5)
False
>>> is_float("5")
False
>>> is_float("+5.4")
True

What is ast?
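
As a brief illustration: ast is the standard-library module that parses Python source into an abstract syntax tree. The sketch below lists the node types that ast.walk produces for a signed literal, which is the sequence the two functions above inspect. It assumes Python 3.8 or later, where numeric literals are reported as Constant nodes.

```python
import ast

# Parse a signed float literal and list node types in walk (breadth-first) order.
tree = ast.parse("-5.4")
names = [type(node).__name__ for node in ast.walk(tree)]

print(names)
# On Python 3.8+: ['Module', 'Expr', 'UnaryOp', 'USub', 'Constant']
```

The sign is a separate UnaryOp node wrapping the literal, which is why the functions above check for alternating UnaryOp and USub/UAdd entries before the final constant.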

2- If you are confident that the variable content is a string:

Use the str.isdigit() method.

>>> a=454
>>> a.isdigit()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'isdigit'
>>> a="454"
>>> a.isdigit()
True

3- Numeric input:

Detecting an int value:

>>> isinstance("54", int)
False
>>> isinstance(54, int)
True

Detecting a float:

>>> isinstance("45.1", float)
False
>>> isinstance(45.1, float)
True

- Ron Reiter · Answer 6

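I wanted to see which method is fastest. Overall, the check_replace function gave the best and most consistent results. The check_exception function gave the fastest results, but only when no exception was raised, meaning its code is the most efficient but the overhead of throwing an exception is quite large.

Please note that checking for a successful cast is the only method that is accurate. For example, the following works with check_exception, but the other two test functions return False for this valid float: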

huge_number = float('1e+100')
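
The difference can be checked directly with the three functions from the benchmark. This is a minimal sketch that reuses their definitions, with a raw-string regex literal.

```python
import re

def check_regexp(x):
    return re.match(r"^\d*\.?\d*$", x) is not None

def check_replace(x):
    return x.replace(".", "", 1).isdigit()

def check_exception(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

# Exponent notation is valid for float() but unknown to the other two checks.
results = {f.__name__: f("1e+100")
           for f in (check_regexp, check_replace, check_exception)}
print(results)
```

Only check_exception accepts the exponent form, because it delegates to float() itself.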

Here is the benchmark code:

import time, re, random, string

ITERATIONS = 10000000

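class Timer:
    def __enter__(self):
        # time.clock() was removed in Python 3.8; perf_counter() replaces it there
        # (the timings quoted below were taken with time.clock())
        self.start = time.perf_counter()
        return self
    def __exit__(self, *args):
        self.end = time.perf_counter()
        self.interval = self.end - self.start

def check_regexp(x):
    # Raw string so that \d and \. reach the regex engine unchanged
    return re.compile(r"^\d*\.?\d*$").match(x) is not None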

def check_replace(x):
    return x.replace('.','',1).isdigit()

def check_exception(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

to_check = [check_regexp, check_replace, check_exception]

print('preparing data...')
good_numbers = [
    str(random.random() / random.random()) 
    for x in range(ITERATIONS)]

bad_numbers = ['.' + x for x in good_numbers]

strings = [
    ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(random.randint(1,10)))
    for x in range(ITERATIONS)]

print('running test...')
for func in to_check:
    with Timer() as t:
        for x in good_numbers:
            res = func(x)
    print('%s with good floats: %s' % (func.__name__, t.interval))
    with Timer() as t:
        for x in bad_numbers:
            res = func(x)
    print('%s with bad floats: %s' % (func.__name__, t.interval))
    with Timer() as t:
        for x in strings:
            res = func(x)
    print('%s with strings: %s' % (func.__name__, t.interval))

Here are the results with Python 2.7.10 on a 2017 MacBook Pro 13:

check_regexp with good floats: 12.688639
check_regexp with bad floats: 11.624862
check_regexp with strings: 11.349414
check_replace with good floats: 4.419841
check_replace with bad floats: 4.294909
check_replace with strings: 4.086358
check_exception with good floats: 3.276668
check_exception with bad floats: 13.843092
check_exception with strings: 15.786169

Here are the results with Python 3.6.5 on a 2017 MacBook Pro 13:

check_regexp with good floats: 13.472906000000009
check_regexp with bad floats: 12.977665000000016
check_regexp with strings: 12.417542999999995
check_replace with good floats: 6.011045999999993
check_replace with bad floats: 4.849356
check_replace with strings: 4.282754000000011
check_exception with good floats: 6.039081999999979
check_exception with bad floats: 9.322753000000006
check_exception with strings: 9.952595000000002

Here are the results with PyPy 2.7.13 on a 2017 MacBook Pro 13:

check_regexp with good floats: 2.693217
check_regexp with bad floats: 2.744819
check_regexp with strings: 2.532414
check_replace with good floats: 0.604367
check_replace with bad floats: 0.538169
check_replace with strings: 0.598664
check_exception with good floats: 1.944103
check_exception with bad floats: 2.449182
check_exception with strings: 2.200056

- Siddharth Satpathy · Answer 7

In most cases for floats, we need to take care of both integers and decimals. Take the string "1.1" as an example.

I would try one of the following:

1.> isnumeric()

word = "1.1"

"".join(word.split(".")).isnumeric()
>>> True

2.> isdigit()

word = "1.1"

"".join(word.split(".")).isdigit()
>>> True

3.> isdecimal()

word = "1.1"

"".join(word.split(".")).isdecimal()
>>> True

Speed:

► All of the methods above have similar speeds.

%timeit "".join(word.split(".")).isnumeric()
>>> 257 ns ± 12 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit "".join(word.split(".")).isdigit()
>>> 252 ns ± 11 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit "".join(word.split(".")).isdecimal()
>>> 244 ns ± 7.17 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
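
One caveat worth checking before using the split-and-join approach: it removes every dot, so a string with several dots still passes. The sketch below shows this and one possible guard, a count check that is an assumption of this example rather than part of the answer.

```python
word = "1.2.3"

# Joining the split pieces removes every dot, leaving "123".
stripped = "".join(word.split("."))
passes_string_check = stripped.isdigit()

# float() rejects the same input.
try:
    float(word)
    parses_as_float = True
except ValueError:
    parses_as_float = False

# Possible guard: allow at most one decimal point.
guarded = word.count(".") <= 1 and stripped.isdigit()

print(passes_string_check, parses_as_float, guarded)
```

With the guard in place the string check agrees with float() for this input.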

- a1an · Answer 8

So, to put it all together, checking for NaN, infinity and complex numbers (it seems they are specified with j, not i, i.e. 1+2j) results in:

def is_number(s):
    try:
        n=str(float(s))
        if n == "nan" or n=="inf" or n=="-inf" : return False
    except ValueError:
        try:
            complex(s) # for complex
        except ValueError:
            return False
    return True
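
As an alternative sketch of the same idea, math.isfinite can replace the string comparison against "nan", "inf" and "-inf". This is a swapped-in technique, not the answer's code, and the name is_finite_number is hypothetical; it assumes the input is a str.

```python
import math

def is_finite_number(s):
    """Hypothetical variant: reject NaN and infinities, accept complex."""
    try:
        # float() succeeded: accept only finite values.
        return math.isfinite(float(s))
    except ValueError:
        pass
    try:
        complex(s)  # e.g. "1+2j"
    except ValueError:
        return False
    return True

for sample in ["1.5", "nan", "inf", "-inf", "1+2j", "abc"]:
    print(sample, is_finite_number(sample))
```

This also rejects values that overflow to infinity during conversion, such as "1e400".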

- zardosht · Answer 9

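str.isnumeric()

Return True if all characters in the string are numeric characters and there is at least one character, False otherwise. Numeric characters include digit characters and all characters that have the Unicode numeric value property, e.g. U+2155, VULGAR FRACTION ONE FIFTH. Formally, numeric characters are those with the property value Numeric_Type=Digit, Numeric_Type=Decimal or Numeric_Type=Numeric.

str.isdecimal()

Return True if all characters in the string are decimal characters and there is at least one character, False otherwise. Decimal characters are those that can be used to form numbers in base 10, e.g. U+0660, ARABIC-INDIC DIGIT ZERO. Formally, a decimal character is a character in the Unicode General Category "Nd".

Both are available for string types from Python 3.0 onward.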
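
To make the difference between these predicates concrete, the sketch below runs them on the two characters named in the documentation excerpts plus a few extra samples, and compares each result with what float() accepts. The helper float_accepts and the extra samples are assumptions added for illustration.

```python
def float_accepts(s):
    """Hypothetical helper: True if float() can parse the string."""
    try:
        float(s)
        return True
    except ValueError:
        return False

samples = {
    "VULGAR FRACTION ONE FIFTH": "\u2155",
    "ARABIC-INDIC DIGIT ZERO": "\u0660",
    "SUPERSCRIPT TWO": "\u00b2",
    "ASCII 42": "42",
    "ASCII -1": "-1",
}

rows = {}
for label, text in samples.items():
    # Columns: isdecimal, isdigit, isnumeric, accepted by float()
    rows[label] = (text.isdecimal(), text.isdigit(),
                   text.isnumeric(), float_accepts(text))
    print(label, rows[label])
```

None of the three string predicates matches float() exactly: isnumeric accepts the fraction character that float() rejects, and all three reject a signed number such as "-1".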

- David Ljung Madison Stellar · Answer 10

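I think your solution is fine, but there is a correct regexp implementation.

There seems to be a lot of regexp hate in these answers, which I think is unjustified; regexps can be reasonably clean, correct and fast. It really depends on what you are trying to do. The original question was how to "check if a string can be represented as a number (float)" (as per your title). Presumably you would want to use the numeric/float value once you have checked that it is valid, in which case your try/except makes a lot of sense. But if, for some reason, you just want to validate that a string is a number, then a regex also works, although it is hard to get right. For example, I think most of the regex answers so far do not properly parse strings without an integer part (such as ".7"), which is a float as far as Python is concerned. And that is slightly tricky to check for in a single regex where the fractional portion is not required. I have included two regexes to show this.

It does raise the interesting question of what a "number" is. Do you include "inf", which is valid as a float in Python? Or do you include numbers that are "numbers" but perhaps cannot be represented in Python (such as numbers larger than the float maximum)?

There are also ambiguities in how you parse numbers. For example, what about "--20"? Is this a "number"? Is this a legal way to represent "20"? Python will let you do "var = --20" and set it to 20 (though really this is because it treats it as an expression), but float("--20") does not work.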
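
The "--20" case can be reproduced in a few lines; this sketch only demonstrates the behaviour described in the paragraph above.

```python
# As source code, --20 is an expression: unary minus applied twice.
var = --20

# As a string handed to float(), the same characters are rejected.
try:
    float("--20")
    parsed = True
except ValueError:
    parsed = False

print(var, parsed)
```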

Anyway, without more information, here is a regex that I believe covers all the ints and floats as Python parses them.

import re

# Doesn't properly handle floats missing the integer part, such as ".7"
# (it also needs at least two digits, so a lone "5" is rejected)
SIMPLE_FLOAT_REGEXP = re.compile(r'^[-+]?[0-9]+\.?[0-9]+([eE][-+]?[0-9]+)?$')
# Example "-12.34E+56"      # sign (-)
                            #     integer (12)
                            #           mantissa (34)
                            #                    exponent (E+56)

# Should handle all floats (one known gap: it rejects a trailing dot
# such as "5.", which float() accepts)
FLOAT_REGEXP = re.compile(r'^[-+]?([0-9]+|[0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?$')
# Example "-12.34E+56"      # sign (-)
                            #     integer (12)
                            #           OR
                            #             int/mantissa (12.34)
                            #                            exponent (E+56)

def is_float(s):
  # Parameter named s rather than str to avoid shadowing the built-in
  return bool(FLOAT_REGEXP.match(s))

Some example test values:

True  <- +42
True  <- +42.42
False <- +42.42.22
True  <- +42.42e22
True  <- +42.42E-22
False <- +42.42e-22.8
True  <- .42
False <- 42nope

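Running the benchmark code from @ron-reiter's answer shows that this regex is actually faster than the plain regex, and that it handles bad values much faster than the exception-based check, which makes some sense. Results: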

check_regexp with good floats: 18.001921
check_regexp with bad floats: 17.861423
check_regexp with strings: 17.558862
check_correct_regexp with good floats: 11.04428
check_correct_regexp with bad floats: 8.71211
check_correct_regexp with strings: 8.144161
check_replace with good floats: 6.020597
check_replace with bad floats: 5.343049
check_replace with strings: 5.091642
check_exception with good floats: 5.201605
check_exception with bad floats: 23.921864
check_exception with strings: 23.755481