Count the number of retries for each request

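I use the requests package together with urllib3.util.retry.Retry() to send tens of thousands of queries. I want to count the number of queries and the number of attempts needed until the data is successfully retrieved. My goal is to build a measure of API reliability.
To make the problem concrete, assume the requests response object carried this data (the total_retries attribute below is hypothetical):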
from requests import Session
from urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter

def create_session():
    session = Session()
    retries = Retry(
        total = 15,
        backoff_factor = 0.5,
        status_forcelist = [401, 408, 429, 500, 502, 504],
        allowed_methods = frozenset(["GET"])
    )

    session.mount('http://', HTTPAdapter(max_retries=retries))
    session.mount('https://', HTTPAdapter(max_retries=retries))

    return session

urls = ['https://httpbin.org/status/500']
count_queries = len(urls)
count_attempts = 0

with create_session() as s:
    for url in urls:
        response = s.get(url)
        count_attempts += response.total_retries

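Since no such attribute exists, I am looking for another way to count the total number of retries.

I have not been able to work out a solution, but while searching I made the following observations, which may help: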

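  • urllib3 stores the retry history in the Retry object. The final Retry object is stored on urllib3.HTTPResponse (docs). requests.Response.raw holds the urllib3.HTTPResponse (strictly speaking, the undecoded body), but it is only accessible when stream=True (docs). As I understand it, I cannot get at this data.
  • One user offered a solution to a similar problem that subclasses the Retry class. Essentially, a callback is invoked that writes a string to a logger. This could be changed to increment a counter instead of logging. However, if possible, I would prefer to track the retries of one particular get, as shown above, rather than the retries of all gets that share a session.
  • A very similar question was asked here, but no (working) solution was provided.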
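
A note on the first observation: as far as I can tell, response.raw stays attached to the response even without stream=True, and that flag only decides whether the body is still unread. If that holds, the Retry object that urllib3 attached to the final response, together with its history tuple, can be read per request. The following is a minimal sketch under that assumption, not a confirmed solution. The names retries_used and demo are hypothetical, and raise_on_status=False is added here so that an exhausted retry budget returns the last response instead of raising.

```python
from types import SimpleNamespace


def retries_used(response):
    """Return how many retries urllib3 recorded for this response (0 if unknown)."""
    retries = getattr(getattr(response, "raw", None), "retries", None)
    history = getattr(retries, "history", None) or ()
    return len(history)


def demo():
    """Network demo; needs requests and urllib3 >= 1.26 and is not run on import."""
    from requests import Session
    from requests.adapters import HTTPAdapter
    from urllib3.util.retry import Retry

    retry = Retry(
        total=3,
        backoff_factor=0.1,
        status_forcelist=[500],
        allowed_methods=frozenset(["GET"]),
        raise_on_status=False,  # assumption: hand back the last response instead of raising
    )
    with Session() as s:
        s.mount("https://", HTTPAdapter(max_retries=retry))
        response = s.get("https://httpbin.org/status/500")
        print(response.status_code, retries_used(response))


# Stand-in object with the same attribute shape, so the helper can be tried offline.
stub = SimpleNamespace(
    raw=SimpleNamespace(retries=SimpleNamespace(history=("first", "second")))
)
print(retries_used(stub))  # prints 2
```

Because the count is read from each response, it could be summed per url inside the loop above in place of the hypothetical total_retries attribute.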

I am using Python 3.9, urllib3 1.26.8 and requests 2.26.0.

1 Answer


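This is a rather verbose solution along the lines of this answer. It counts requests and retries at the session level (which, however, is not my preferred approach).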

import requests
from urllib3.util.retry import Retry

class RequestTracker:
    """ track queries and retries """
    def __init__(self):
        self._retries = 0
        self._queries = 0

    def register_retry(self):
        self._retries += 1

    def register_query(self):
        self._queries += 1

    @property
    def retries(self):
        return self._retries

    @property
    def queries(self):
        return self._queries

class RetryTracker(Retry):
    """ subclass Retry to track count of retries """
    def __init__(self, *args, **kwargs):
        self._request_tracker = kwargs.pop('request_tracker', None)
        super(RetryTracker, self).__init__(*args, **kwargs)
    
    def new(self, **kw):
        """ pass additional information when creating new Retry instance """
        kw['request_tracker'] = self._request_tracker
        return super(RetryTracker, self).new(**kw)
    
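    def increment(self, method, url, *args, **kwargs):
        """ register retry attempt when new Retry object with incremented counter is returned """
        # call the parent first: it raises MaxRetryError once the retries are exhausted,
        # so only attempts that are actually followed by a retry get counted
        new_retry = super(RetryTracker, self).increment(method, url, *args, **kwargs)
        if self._request_tracker:
            self._request_tracker.register_retry()
        return new_retry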

class RetrySession(requests.Session):
    """ subclass Session to track count of queries """
    def __init__(self, request_tracker):
        super().__init__()
        # shared RequestTracker instance (the same one RetryTracker reports to)
        self._request_tracker = request_tracker

    def prepare_request(self, request):
        """ increment query counter """
        self._request_tracker.register_query()
        return super().prepare_request(request)

class RequestManager:
    """ manage requests """    
    def __init__(self, request_tracker=None):
        # session settings
        self.__session = None
        self.__request_tracker = request_tracker

        # retry logic specification
        args = dict(
            total = 11,
            backoff_factor = 1,
            status_forcelist = [401,408, 429, 500, 502, 504],
            allowed_methods = frozenset(["GET"])
        )
        if self.__request_tracker is not None:
            args['request_tracker'] = self.__request_tracker
            self.__retries = RetryTracker(**args)
        else:
            self.__retries = Retry(**args)
    
    @property
    def session(self):
        if self.__session is None:
            # create new session
            if self.__request_tracker is not None:
                self.__session = RetrySession(self.__request_tracker)
            else:
                self.__session = requests.Session()
            
            # mount https adapter with retry logic
            https = requests.adapters.HTTPAdapter(max_retries=self.__retries)
            self.__session.mount('https://', https)
        
        return self.__session
    
    @session.setter
    def session(self, value):
        raise AttributeError('Setting session attribute is prohibited.')

request_tracker = RequestTracker()
request_manager = RequestManager(request_tracker=request_tracker)
session = request_manager.session
urls = ['https://httpbin.org/status/500']

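with session as s:
    for url in urls:
        try:
            response = s.get(url)
        except requests.exceptions.RetryError:
            # raised once all retries for this url are used up;
            # the tracker has still recorded the attempts
            pass

print(request_tracker.queries)
print(request_tracker.retries)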
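
The two totals printed above can be turned into a simple reliability figure. The sketch below is one possible definition and not part of the answer itself: reliability_summary is a hypothetical helper, and the example numbers are made up for illustration.

```python
def reliability_summary(queries, retries):
    """Derive simple session-level figures from the two tracker totals."""
    attempts = queries + retries  # every query is one attempt plus its retries
    return {
        "attempts": attempts,
        "attempts_per_query": attempts / queries if queries else 0.0,
        "retry_share": retries / attempts if attempts else 0.0,
    }


# Illustrative numbers only: 4 queries that needed 4 retries in total.
summary = reliability_summary(queries=4, retries=4)
print(summary)
```

With a tracker as defined above, the call would be reliability_summary(request_tracker.queries, request_tracker.retries).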
