cProfile和profile之间的区别——为什么没有profile.Profile.enable()方法?

4

在Python 2.7和Python 3.6中,我发现这个代码可以运行:

from cProfile import Profile; p = Profile(); p.enable()

...而这会引发异常:

from profile import Profile; p = Profile(); p.enable()

# -->  AttributeError: 'Profile' object has no attribute 'enable'

这让我感到惊讶,因为我认为(Python 3Python 2的官方文档也表示),这两个模块应该提供相同的程序员接口:

profilecProfile 模块都提供以下函数:

...

class profile.Profile(timer=None, timeunit=0.0, subcalls=True, builtins=True)

...

enable()

开始收集性能分析数据。

我错过了什么?使用 profile.Profile() 实例的正确方法是什么?

1
你没有错过任何东西。我也发现这种情况,并通过查看CPython源代码进行了验证。然而,我不知道为什么会这样,只知道事情确实是这样写的。 - martineau
@martineau +1 什么是使用 profile.Profile() 的正确方式 -- 经过验证,Python 3.7 和 3.8 上也存在相应行为 - Jon
@jon:当前Python 3.x文档中关于Profile.enable()方法的说明称它仅在cProfile中可用,这意味着您必须使用profile.Profile的一个run*()方法。这里有一些示例 - martineau
是的,我在使用inspect.getmembers()检查profile后发现了run()方法的存在。我会发布一个带有这些信息的答案。 - Jon
@jon:顺便提一下,当回复评论不是由OP发表时,你需要添加以@为前缀的用户名,这样他们就会被通知到你的回复。 我在这种情况下仅仅是碰巧找到了你的回复(因为有时我会检查最近我自己所做的评论是否有任何回应)。 - martineau
1个回答

1
截至 Python 3.8 及更新的文档,已经修复了这个问题并指出 enable()disable() 仅适用于 cProfile。对于 profile.Profile,你需要使用 run() 调用而非 enable()disable()。以下是每个示例:

cProfile.Profile:

# cProfile.Profile -- control with enable() and disable()
from datetime import datetime
import cProfile, pstats, io
sortby = 'cumulative'
pr1 = cProfile.Profile(lambda: int(datetime.now().timestamp()*1000000), 0.000001)
pr1.enable()
list(x for x in range(int(10*10*10*10/10*10+10)))
pr1.disable()
s1 = io.StringIO()
ps1 = pstats.Stats(pr1, stream=s1).sort_stats(sortby)
ps1.print_stats()
print(s1.getvalue())

profile.Profile:

# profile.Profile -- control with run()
from datetime import datetime
import profile, pstats, io
sortby = 'cumulative'
pr2 = profile.Profile(datetime.now().timestamp, 0.000001)
pr2.run('list(x for x in range(int(10*10*10*10/10*10+10)))')
s2 = io.StringIO()
ps2 = pstats.Stats(pr2, stream=s2).sort_stats(sortby)
ps2.print_stats()
print(s2.getvalue())

以下是我观察到的两种形式之间的行为差异:

  • cProfile will push profiled functions' output to stdout, whereas profile will not.

  • The final report's filename:lineno(function) column will have slightly different output. cProfile generally will display built-in functions more readably.

    • In the example I provided here, however, the passed in function is actually identifiable in profile's output and not in cProfile's -- but if using a named function, it is identifiable in both profilers' output (although the passed in arguments to the function are only visible from profile's output).
  • Running identical functions thru both cProfile and profile provides slightly different results. It seems cProfile generates about two less function calls generally, and will provide more efficient times for the same function.

  • Using the profilers' optional custom timer and timeunit parameters can offer the benefit of gaining microsecond precision vs the default millisecond precision. However, it seems that attempting to use these comes with some caveats:

    • cProfile's timeunit function requires an integer, whereas profile expects a floating point number. This means you can't use datetime.now().timestamp() directly with cProfile, but must rather wrap it in a lambda to convert it to a full integer first.
    • Additionally, for some reason profile seems to not work correctly with datetime.now().timestamp() even though that's in the format it's expecting. It will print out negative time values -- I show this below. Switching to the lambda form to convert it to integer doesn't help, as profile expects the floating point number.
  • Below is the output of profile.Profile with the datetime.now().timestamp() timer for microsecond provision.

     print(s2.getvalue())
           10015 function calls in -0.020 seconds`
    
     Ordered by: cumulative time
    
     ncalls  tottime  percall  cumtime  percall filename:lineno(function)
          0    0.000             0.000          profile:0(profiler)
          1    0.000    0.000    0.000    0.000 :0(setprofile)
      10011   -0.010   -0.000   -0.010   -0.000 <string>:1(<genexpr>)
          1   -0.010   -0.010   -0.020   -0.020 <string>:1(<module>)
          1   -0.000   -0.000   -0.020   -0.020 :0(exec)
          1   -0.000   -0.000   -0.020   -0.020 profile:0(list(x for x in range(int(10*10*10*10/10*10+10))))
    
  • However, this problem can be fixed by providing a custom timer function (which can even wrap around datetime.now() to provide the exact same output format as datetime.now().timestamp()), for reasons I can't explain.

例如,如果我们定义:
def timenow():
   now = datetime.now()
   return (((now.year-1970) * 31557600) + (now.month * 2629800) + 
           (now.day * 86400) + (now.hour * 3600) + (now.minute * 60) +
           now.second + (now.microsecond * 0.000001))

我们现在可以使用pr2 = profile.Profile(timenow, 0.000001),这将输出以下(健康的)报告:
>>> print(s2.getvalue())
         10015 function calls in 0.041 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    0.041    0.041 profile:0(list(x for x in range(int(10*10*10*10/10*10+10))))
        1    0.000    0.000    0.035    0.035 :0(exec)
        1    0.017    0.017    0.035    0.035 <string>:1(<module>)
    10011    0.018    0.000    0.018    0.000 <string>:1(<genexpr>)
        1    0.001    0.001    0.001    0.001 :0(setprofile)
        0    0.000             0.000          profile:0(profiler)

与使用上面定义的自定义lambda计时器的cProfile输出相比,我们有:
>>> print(s1.getvalue())
         10013 function calls in 0.031 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10011    0.031    0.000    0.031    0.000 <stdin>:1(<genexpr>)
        1    0.000    0.000    0.000    0.000 <stdin>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

关于为什么我们的自定义计时器函数给出了合理的结果而datetime.now().timestamp()profile不同,首先必须注意到timenow不提供datetime.now().timestamp()的计时。虽然它提供接近纪元时间,但是偏差很小(在同一系统上,增量将相同,不应影响正常的配置文件运行)。timenow中使用的手动转换肯定也不像datetime.now().timestamp()那样高效。我只能猜测后者过于高效,某种程度上导致profile源代码无法正确测量。

简而言之

除非您有特定的原因,否则请使用cProfile。官方Python文档还指出,推荐使用模块是cProfile。虽然没有提供具体的理由,但该文档似乎暗示cProfile可能表现更好,并适用于大多数用户的目的(即使它是稍微冗长的语法)。


网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接