Python: determine whether a string should be converted to int or float

25

I want to convert a string to the tightest possible data type: int or float.

I have two strings:

value1="0.80"     #this needs to be a float
value2="1.00"     #this needs to be an integer.

In Python, how can I determine that value1 should be a float and value2 should be an integer?

10 Answers

33
def isfloat(x):
    try:
        a = float(x)
    except (TypeError, ValueError):
        return False
    else:
        return True

def isint(x):
    try:
        a = float(x)
        b = int(a)
    except (TypeError, ValueError, OverflowError):
        # int() raises OverflowError for inf and ValueError for nan
        return False
    else:
        return a == b

3
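Of course, you should test isint() before isfloat(), and only run the latter if the former returns False. - Tim Pietzcker
Try isint() first and then isfloat(); but if you use isint(), all the tests pass. Do you think it's better to use only isint()? - ManuParra
2
This is wrong for very large numbers. For example, float("10000000000000000.5") is 1e+16 and int(1e+16) compares equal to 1e+16, but the number is not an integer. - Nicolas
@Alexandr True, but then 5.0 (or the 1.00 from the question) would fail too. I don't see an easy way to accept 5.0 while rejecting 5e0. - glglgl
@Navaro Does 5e-2 have a decimal point? No. Is it a float? Yes. Is it an integer? No. - glglgl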
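Following up on the comment about very large numbers: one way to sidestep float rounding is to do the integrality check with decimal.Decimal, which parses the string exactly. This is only a minimal sketch; the helper name classify_number and the choice to return None for non-numeric text are my own assumptions, not part of the answer above, and the input is assumed to be a str.

```python
from decimal import Decimal, InvalidOperation

def classify_number(text):
    """Return int, float, or None for a numeric string (hypothetical helper)."""
    try:
        d = Decimal(text)  # exact parse, no binary rounding
    except InvalidOperation:
        return None  # not a number at all
    if not d.is_finite():
        return float  # nan / inf only exist as floats
    # to_integral_value() rounds to a whole number; equality means
    # there was no fractional part to begin with.
    return int if d == d.to_integral_value() else float

print(classify_number("0.80"))
print(classify_number("1.00"))
print(classify_number("10000000000000000.5"))
```

Note that this treats "5e0" as an integer, which is the ambiguity discussed in the comments above.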

26

Python float objects have an is_integer method:

from ast import literal_eval
def parses_to_integer(s):
    val = literal_eval(s)
    return isinstance(val, int) or (isinstance(val, float) and val.is_integer())

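I prefer this approach because it doesn't rely on exceptions as part of the evaluation. - DaveL17
If s can't be evaluated as a valid literal, it will raise an error. Also, 'False' will be reported as an integer, which is probably not what you want. I know you're answering the user's question directly, so your solution is valid, but any mousetrap can be improved. - shrewmouse
2
This gives a false positive for strings such as 120.0. - Kazu
@DaveL17 Python is designed to handle exceptions efficiently, unlike some other languages. That's why it favours the EAFP principle. - Mark Ransom
@MarkRansom OK, fair point. I'm not trying to argue, but I still prefer this approach because I think it is more Pythonic. If I control what gets passed to literal_eval and speed isn't a concern, I don't see a problem. I also like some things that PEP8 doesn't endorse. <smile> - DaveL17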
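To illustrate the two issues raised in the comments (an error on non-literals, and 'False' counting as an integer), here is a hedged variant. The name parses_to_integer_safe is hypothetical, and returning False for unparsable input is my assumption about the desired behaviour.

```python
from ast import literal_eval

def parses_to_integer_safe(s):
    """Like parses_to_integer, but tolerant of non-literals and bools (sketch)."""
    try:
        val = literal_eval(s)
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError):
        return False  # s is not a Python literal
    if isinstance(val, bool):
        return False  # bool is a subclass of int; exclude it explicitly
    return isinstance(val, int) or (isinstance(val, float) and val.is_integer())

for text in ["1.00", "0.80", "False", "hello"]:
    print(text, parses_to_integer_safe(text))
```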

7

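When I was trying to determine the differences between two XML documents, I had to make sure that '1.0' was converted to '1', so I wrote this function to help. I think some of the other solutions fail on string literals such as 'True' or 'False'. In any case, this function works very well for me, and I hope it helps you too.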

from ast import literal_eval

def convertString(s):
    '''
    Try to convert a string literal to a number or a bool such that '1.0'
    and '1' both return 1.

    The point of this is to ensure that '1.0' and '1' return as int(1) and that
    'False' and 'True' are returned as bools, not numbers.

    This is useful for generating text that may contain numbers for diff
    purposes.  For example you may want to dump two XML documents to text files
    then do a diff.  In this case you would want <blah value='1.0'/> to match
    <blah value='1'/>.

    The solution for me is to convert the 1.0 to 1 so that diff doesn't see a
    difference.

    If s doesn't evaluate to a literal then s is returned unchanged.

    If s is a float, or evaluates to a float literal, then an int is returned
    when the float has no fractional part (i.e. 1.0 becomes 1); otherwise the
    float itself is returned (e.g. '1.1' becomes 1.1).

    If s evaluates to any other valid literal then that literal is returned
    (e.g. '1' becomes 1 and 'False' becomes False).
    '''


    if isinstance(s, str):
        # It's a string.  Does it represent a literal?
        #
        try:
            val = literal_eval(s)
        except Exception:
            # s doesn't represent any sort of literal so no conversion will be
            # done.
            #
            val = s
    else:
        # It's already something other than a string
        #
        val = s

    ##
    # Is the float actually an int? (i.e. is the float 1.0 ?)
    #
    if isinstance(val, float):
        if val.is_integer(): 
            return int(val)

        # It really is a float
        return val

    return val

The unit test output for this function is:
convertString("1")=1; we expect 1
convertString("1.0")=1; we expect 1
convertString("1.1")=1.1; we expect 1.1
convertString("010")=8; we expect 8
convertString("0xDEADBEEF")=3735928559; we expect 3735928559
convertString("hello")="hello"; we expect "hello"
convertString("false")="false"; we expect "false"
convertString("true")="true"; we expect "true"
convertString("False")=False; we expect False
convertString("True")=True; we expect True
convertString(sri.gui3.xmlSamples.test_convertString.A)=sri.gui3.xmlSamples.test_convertString.A; we expect sri.gui3.xmlSamples.test_convertString.A
convertString(<function B at 0x7fd9e2f27ed8>)=<function B at 0x7fd9e2f27ed8>; we expect <function B at 0x7fd9e2f27ed8>
convertString(1)=1; we expect 1
convertString(1.0)=1; we expect 1
convertString(1.1)=1.1; we expect 1.1
convertString(3735928559)=3735928559; we expect 3735928559
convertString(False)=False; we expect False
convertString(True)=True; we expect True

The unit test code is:
import unittest

# Just a class for testing that the class gets returned unchanged.
#
class A: pass

# Just a function
#
def B(): pass

class Test(unittest.TestCase):


    def setUp(self):
        self.conversions = [
            # input      | expected
            ('1'         ,1         ),
            ('1.0'       ,1         ), # float with no fractional part
            ('1.1'       ,1.1       ),
            ('010'       ,8         ), # octal (Python 2 literal; Python 3 needs '0o10')
            ('0xDEADBEEF',0xDEADBEEF), # hex
            ('hello'     ,'hello'   ),
            ('false'     ,'false'   ),
            ('true'      ,'true'    ),
            ('False'     ,False     ), # bool
            ('True'      ,True      ), # bool
            (A           ,A         ), # class
            (B           ,B         ), # function
            (1           ,1         ),
            (1.0         ,1         ), # float with no fractional part
            (1.1         ,1.1       ),
            (0xDEADBEEF  ,0xDEADBEEF),
            (False       ,False     ),
            (True        ,True      ),
        ]


    def testName(self):
        for s,expected in self.conversions:
            rval = convertString(s)
            print('convertString({s})={rval}; we expect {expected}'.format(**locals()))
            self.assertEqual(rval, expected)


if __name__ == "__main__":
    #import sys;sys.argv = ['', 'Test.testName']
    unittest.main()

6
def coerce(x):
    try:
        a = float(x)
        b = int(x)
        if a != b:
            return a
        else:
            return b
    except:
        raise ValueError("failed to coerce str to int or float")

2
This will fail for floats. If x == "0.5", the line b = int(x) immediately raises a ValueError, and the if a != b check is never reached. - daveruinseverything
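A possible way around the problem described in that comment is to try int() first and fall back to float() only when it fails. This is a sketch rather than the answerer's code; the name coerce_number is hypothetical and the input is assumed to be a str.

```python
def coerce_number(x):
    """Return an int when the string denotes a whole number, else a float."""
    try:
        return int(x)  # handles "7", "-3", and arbitrarily large integers
    except ValueError:
        pass  # not a plain integer literal; try float next
    try:
        a = float(x)
    except ValueError:
        raise ValueError("failed to coerce str to int or float") from None
    # float.is_integer() is False for nan and inf, so those stay floats
    return int(a) if a.is_integer() else a

print(coerce_number("0.5"))
print(coerce_number("1.00"))
```

Because the fallback goes through float, very long decimal strings are still subject to floating-point rounding.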

4
Another approach is to use regular expressions (this covers plain signed integers and decimals, plus nan and inf):
import re

def parse_str(num: str):
    """
    Parse a string that is expected to contain a number.
    :param num: str. the number in string.
    :return: float or int. Parsed num.
    """
    if not isinstance(num, str):  # check type
        raise TypeError('num should be a str. Got {}.'.format(type(num)))
    clean_num = num.replace(" ", "").upper()  # get rid of spaces & make it uppercase
    if clean_num in ('NAN', '+NAN', '-NAN', '+INF', 'INF', '-INF'):
        return float(clean_num)
    if re.search(r'^-?\d+$', clean_num):
        return int(clean_num)
    # Group the alternatives so that ^, -? and $ apply to both of them.
    if re.search(r'^-?(\d*\.\d+|\d+\.\d*)$', clean_num):
        return float(clean_num)
    raise ValueError('num is not a number. Got {}.'.format(num))

Regex pattern notes

^      beginning of string
$      end of string
-?     match zero or one instance of "-" sign
\d+    one or many digits
\d*    none or many digits
\.     literal dot
|      or
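One detail worth checking in patterns of this kind: `|` has the lowest precedence, so `^`, `-?` and `$` bind to only one side of the alternation unless the alternatives are wrapped in a group. The sketch below compares an ungrouped and a grouped pattern; the names UNGROUPED and GROUPED and the sample strings are purely illustrative.

```python
import re

# Without a group, this reads as "^-?(\d*\.\d+)"  OR  "(\d+\.\d*)$"
UNGROUPED = re.compile(r'^-?(\d*\.\d+)|(\d+\.\d*)$')
# With a group, the anchors and the sign apply to both alternatives
GROUPED = re.compile(r'^-?(\d*\.\d+|\d+\.\d*)$')

for sample in ['1.5', '-4.', '1.5XYZ', 'ABC1.']:
    print(sample,
          bool(UNGROUPED.search(sample)),
          bool(GROUPED.search(sample)))
```

With these samples, the ungrouped pattern reports a match for '1.5XYZ' and 'ABC1.', while the grouped one rejects both.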

Tests (note the inserted spaces)
print(parse_str('1'))
print(parse_str('-999'))
print(parse_str('1   .   2'))
print(parse_str('.3'))
print(parse_str('4.'))
print(parse_str('- 1  2.3  4'))
print(parse_str('    0.5    '))
print(parse_str('nan'))
print(parse_str('-nan'))
print(parse_str('inf'))
print(parse_str('-inf'))
print(parse_str('+nan'))
print(parse_str('+inf'))
print(parse_str('X Y Z'))  # this should throw error

Results

1
-999
1.2
0.3
4.0
-12.34
0.5
nan
nan
inf
-inf
nan
inf
ValueError: num is not a number. Got X Y Z.

1
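I found a problem when the value is negative: these expressions cannot match it because they don't match the "-". I added -? after \s* so that the minus sign can be matched. - witoong623
@witoong623 Added. Thanks! - Yahya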

1

An example of a short function: it returns the numeric type of a string (float or int), and returns str for non-numeric strings.

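def numeric_type_of_string(string: str):
    # isdecimal() only accepts digits that int() can parse
    if string.isdecimal():
        return int
    try:
        val = float(string)
        return int if val == int(val) else float
    except (TypeError, ValueError, OverflowError):
        # int(nan) raises ValueError, int(inf) raises OverflowError
        return str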

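If you want the converted value directly, just change the return values:
def string_to_numeric_if_possible(string: str):
    # isdecimal() only accepts digits that int() can parse
    if string.isdecimal():
        return int(string)
    try:
        val = float(string)
        return int(val) if val == int(val) else val
    except (TypeError, ValueError, OverflowError):
        # int(nan) raises ValueError, int(inf) raises OverflowError
        return string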

1
Please show some output for the different types. - not2qubit
Which type are you missing? - Johannes

1
import re

lineVal = ['1850', '-0.373', '-0.339', '-0.425']

lineVal2 = [ float(x) if re.search(r'\.',x) else int(x) for x in lineVal ]

lineVal2 output ==> [1850, -0.373, -0.339, -0.425]

I'm a beginner; I tried this and it seems to work for me.


1
I don't think this works for '1.00', for example. - art-solopov

0

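Here is an interesting solution that uses eval(). Note: using eval in production, or anywhere that may receive user input, is very dangerous and not recommended! Treat this answer as academic only.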

def get_string_type(x):
    if not isinstance(x, str):
        raise ValueError('Input must be a string!')
    try:
        string_type = type(eval(x))
    except (NameError, SyntaxError):
        # Not evaluable as an expression, so treat it as plain text
        string_type = str
    return string_type

Because eval treats the string as raw code, this works for any type you could enter into a REPL. For example:

>>> from decimal import Decimal
>>> my_test_string = 'Decimal(0.5)'
>>> type(my_test_string)
<class 'str'>
>>> get_string_type(my_test_string)
<class 'decimal.Decimal'>
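If only plain literals matter (numbers, strings, booleans, lists and so on), the same idea can be sketched with ast.literal_eval, which never executes code. The name get_literal_type is hypothetical, and note the trade-off: unlike the eval version, this cannot recognise constructor calls such as Decimal(0.5) and reports them as str.

```python
from ast import literal_eval

def get_literal_type(x):
    """Return the type of the literal in x, or str if x is not a literal."""
    if not isinstance(x, str):
        raise TypeError('Input must be a string!')
    try:
        return type(literal_eval(x))
    except (ValueError, SyntaxError):
        # Names, calls and malformed expressions are treated as plain text
        return str

for text in ['1.5', '3', 'hello', 'Decimal(0.5)']:
    print(text, get_literal_type(text))
```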

0

This simple function solves the problem; you only need the "Solution" code block.

intOrfloat.py:

import sys
def NumberType(argv):
    Number = argv[1]
    try:
        float(Number)   # Exception if not a number
        ################ Solution ################
        if '.' not in Number:
            return '%s is Integer'%(Number)
        if int(Number.split('.')[1]) > 0:
            return '%s is Float'%(Number)
        else:
            return '%s is Integer'%(Number)
        ##########################################
    except Exception as e:
        return '%s is Text...'%(Number)
if __name__ == '__main__':
    print(NumberType(sys.argv))

Tests:

>python intOrfloat.py 0.80
0.80 is Float

>python intOrfloat.py 1.00
1.00 is Integer

>python intOrfloat.py 9999999999999999
9999999999999999 is Integer

So there is no need to worry about the size of the integer.

'.' not in Number              # number without decimal must be an integer
Number.split('.')              # split into [integer String, decimal String]
Number.split('.')[1]           # get decimal String
int(Number.split('.')[1])      # convert it into decimal Number
int(Number.split('.')[1]) > 0  # decimal Number > 0 = Float; Otherwise, Integer

1
Please add some explanation to your answer instead of only posting code. - kk.

0

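As I understand it, the request is to convert a number stored as a string to whichever of <float> or <int> is the tightest data type. The following function fulfils that request (but it does not check that the input is valid, i.e. that it is a number rather than alphabetic characters).

str_to_float_or_int() converts a number stored as a string to a <float> or an <int>. Although every integer can also be a float, an <int> is returned wherever possible to meet the "convert to the tightest data type" criterion, for example when:

  • input = a whole number
  • input = a characteristic with a mantissa of 0 ('1.0')
  • input = a characteristic with no mantissa ('1.')

This technique uses str.isdecimal() to determine whether a string is a value (as opposed to empty), and the <str>.split(".") method to parse the string value (the candidate number) into two parts:

  1. integer = the digits before the decimal point
  2. mantissa = the digits after the decimal point

The built-in <str>.split(".") method returns a list. In this case the list has the form [integer, mantissa].

Note: technically, the term "integer" here really means "characteristic". I use "integer" because it has fewer characters and is therefore easier to use in code.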
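To make the [integer, mantissa] description concrete, the sketch below prints what split(".") returns for each kind of input. str.partition is shown alongside purely as an illustration (my addition, not part of this answer): it always yields three parts, so no length check is needed.

```python
samples = ["0.80", "1.00", "5", ".1", "4."]

for text in samples:
    parts = text.split(".")                   # one or two items
    whole, dot, frac = text.partition(".")    # always exactly three items
    print(repr(text), parts, (whole, dot, frac))
```

Note that "5" gives a single-item list, while ".1" and "4." give an empty string on one side.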

def str_to_float_or_int(value_str, ShowExtended=False):
    # Convert a number stored as a string to a <float> or an <int>,
    # whichever is the "tightest" data type.

    # Default condition is that the number is a <float>
    isfloat = True
    value = float(value_str)
    numberParsed = value_str.split(".")
    if len(numberParsed) > 1:
        integer = numberParsed[0]
        mantissa = numberParsed[1]
        if integer.isdecimal() and mantissa.isdecimal():
            if int(mantissa) == 0:
                # value is an integer; mantissa is 0
                isfloat = False
                value = int(integer)
        elif integer.isdecimal() and mantissa == "":
            # value is an integer because .split() returned an
            # empty mantissa (e.g. '4.').
            isfloat = False
            value = int(integer)
    else:
        # value is an integer because .split() returned
        # a single value list.
        isfloat = False
        value = int(value_str)
    if ShowExtended:
        print("testValue: " + value_str + " | splits into: ",
              numberParsed, "\n value: ", value)
        if isfloat:
            print("It's a <float> (;o)\n")
        else:
            print("It's an <int> {:o)~\n")
    return value

Script run from the console to test str_to_float_or_int()


testValues = ["0.80", "1.00", "5", ".1", "4."]
print("\n-----------------------------------------------\n" +
        "| Testcase:  ", testValues, " |\n" +
        "-----------------------------------------------")
for number in testValues:
    str_to_float_or_int(number, ShowExtended=True)

Output (copied from the console)
>   ---------------------------------------------------
>   |  Testcase:   ['0.80', '1.00', '5', '.1', '4.']  |
>   ---------------------------------------------------
>   testValue: 0.80 | splits into:  ['0', '80']
>   value:  0.8
>   It's a <float> (;o)
>   
>   testValue: 1.00 | splits into:  ['1', '00']
>   value:  1
>   It's an <int> {:o)~
>   
>   testValue: 5 | splits into:  ['5']
>   value:  5
>   It's an <int> {:o)~
>   
>   testValue: .1 | splits into:  ['', '1']
>   value:  0.1
>   It's a <float> (;o)
>
>   testValue: 4. | splits into:  ['4', '']
>   value:  4
>   It's an <int> {:o)~

Content provided by Stack Overflow.