Technique for subclassing Django's UpdateCacheMiddleware and FetchFromCacheMiddleware

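I've used the UpdateCacheMiddleware and FetchFromCacheMiddleware middleware to enable site-wide anonymous caching, with mixed results. The biggest problem is that the middleware only caches an anonymous user's first request. Because a session_id cookie is set on that first response, subsequent requests by that anonymous user do not hit the cache, since the view-level cache varies on headers.
My pages do not differ meaningfully between anonymous users, at least not in ways that affect caching, and where they do differ I can handle it via Ajax. So I decided to try subclassing Django's caching middleware so that it no longer varies on headers and instead varies on anonymous vs. logged-in users. Because I am using the Auth backend, and its handler runs before the fetch from the cache, this seems to work.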
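from django.conf import settings
from django.core.cache import get_cache
from django.middleware.cache import FetchFromCacheMiddleware, UpdateCacheMiddleware
from django.utils.cache import (
    _generate_cache_header_key, _generate_cache_key,
    get_max_age, patch_response_headers)

class AnonymousUpdateCacheMiddleware(UpdateCacheMiddleware):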

    def process_response(self, request, response):
        """
        Sets the cache, if needed.
        We are overriding it in order to change the behavior of learn_cache_key().
        """

        if not self._should_update_cache(request, response):
            # We don't need to update the cache, just return.
            return response
        if response.status_code != 200:
            return response

        timeout = get_max_age(response)
        if timeout is None:
            timeout = self.cache_timeout
        elif timeout == 0:
            # max-age was set to 0, don't bother caching.
            return response
        patch_response_headers(response, timeout)
        if timeout:
            ######### HERE IS WHERE IT REALLY GOES DOWN #######
            cache_key = self.learn_cache_key(request, response, self.cache_timeout, self.key_prefix, cache=self.cache)
            if hasattr(response, 'render') and callable(response.render):
                response.add_post_render_callback(
                    lambda r: self.cache.set(cache_key, r, timeout)
                )
            else:
                self.cache.set(cache_key, response, timeout)
        return response

    def learn_cache_key(self, request, response, timeout, key_prefix, cache=None):
        """_generate_cache_header_key() creates a key for the given request path, adjusted for locales.

            With this key, a new cache key is set via _generate_cache_key() for the HttpResponse

            The subsequent anonymous request to this path hits the FetchFromCacheMiddleware in the
            request capturing phase, which then looks up the headerlist value cached here on the initial response.

            FetchFromCacheMiddleware calculates a cache_key based on the values of the listed headers using _generate_cache_key
            and then looks for the response stored under that key.  If the headers are the same as those
            set here, there will be a cache hit and the cached HTTPResponse is returned.
        """

        key_prefix = key_prefix or settings.CACHE_MIDDLEWARE_KEY_PREFIX
        cache_timeout = self.cache_timeout or settings.CACHE_MIDDLEWARE_SECONDS
        cache = cache or get_cache(settings.CACHE_MIDDLEWARE_ALIAS)

        cache_key = _generate_cache_header_key(key_prefix, request)

        # Django normally varies caching by headers so that authed/anonymous users do not see same pages
        # This makes Google Analytics cookies break caching;
        # It also means that different anonymous session_ids break caching, so only first anon request works
        # In this subclass, we are ignoring headers and instead varying on authed vs. anonymous users
        # Alternatively, we could also strip cookies potentially for the same outcome

        # if response.has_header('Vary'):
        #     headerlist = ['HTTP_' + header.upper().replace('-', '_')
        #                   for header in cc_delim_re.split(response['Vary'])]
        # else:
        headerlist = []

        cache.set(cache_key, headerlist, cache_timeout)
        return _generate_cache_key(request, request.method, headerlist, key_prefix)
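
To make the two-step lookup described in the docstring concrete, here is a minimal, framework-free sketch. It is a simplified model and not Django's actual implementation: `header_key`, `page_key`, `learn`, `fetch`, and the dict-based `cache` are hypothetical names, and the key format is invented for illustration. It suggests why varying on Cookie leads to a miss for a second anonymous session, while an empty header list lets anonymous requests share one entry.

```python
import hashlib

cache = {}  # stand-in for the cache backend (assumption: a plain dict)


def header_key(path):
    # Key under which the list of "vary" headers for a path is stored.
    return "headers:" + path


def page_key(path, method, headerlist, meta):
    # Hash the values of the listed headers so the key changes with them.
    ctx = hashlib.md5()
    for name in headerlist:
        value = meta.get(name)
        if value is not None:
            ctx.update(value.encode("utf-8"))
    return "page:%s:%s:%s" % (method, path, ctx.hexdigest())


def learn(path, method, headerlist, meta, response):
    # Response phase: remember the header list, then store the page.
    cache[header_key(path)] = headerlist
    cache[page_key(path, method, headerlist, meta)] = response


def fetch(path, method, meta):
    # Request phase: look up the header list first, then the page itself.
    headerlist = cache.get(header_key(path))
    if headerlist is None:
        return None
    return cache.get(page_key(path, method, headerlist, meta))


first = {"HTTP_COOKIE": "sessionid=aaa"}
second = {"HTTP_COOKIE": "sessionid=bbb"}

# Varying on Cookie: a different session id produces a different key.
learn("/home/", "GET", ["HTTP_COOKIE"], first, "<html>home</html>")
miss = fetch("/home/", "GET", second)

# Empty header list, as in the subclass above: all requests share one key.
cache.clear()
learn("/home/", "GET", [], first, "<html>home</html>")
hit = fetch("/home/", "GET", second)

print(miss, hit)
```

In this model the only thing separating the two cases is the header list stored during the response phase, which is the value the subclass forces to be empty.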

The fetcher, which is responsible for retrieving the page from the cache, looks like this:
class AnonymousFetchFromCacheMiddleware(FetchFromCacheMiddleware):

    def process_request(self, request):
        """
        Checks whether the page is already cached and returns the cached
        version if available.
        """
        if request.user.is_authenticated():
            request._cache_update_cache = False
            return None
        else:
            return super(AnonymousFetchFromCacheMiddleware, self).process_request(request)

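Obviously, there is a lot of code copied from UpdateCacheMiddleware. I couldn't find a better hook to make this cleaner.

Does this seem like a good approach in general? Are there any obvious problems?

Thanks, Ben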

1 Answer

You can work around this by temporarily removing the unwanted vary fields from response['Vary']:

from django.middleware.cache import UpdateCacheMiddleware
from django.utils.cache import cc_delim_re

class AnonymousUpdateCacheMiddleware(UpdateCacheMiddleware):
    def process_response(self, request, response):
        vary = None
        if not request.user.is_authenticated() and response.has_header('Vary'):
            vary = response['Vary']
            # Only Cookie is hidden here; add more header names as needed.
            # Header names are case-insensitive, so compare in lowercase.
            response['Vary'] = ', '.join(
                v for v in cc_delim_re.split(vary) if v.lower() != 'cookie')
        response = super(AnonymousUpdateCacheMiddleware, self).process_response(request, response)
        if vary is not None:
            response['Vary'] = vary
        return response

Also, set CACHE_MIDDLEWARE_ANONYMOUS_ONLY = True in your settings to prevent caching for authenticated users.
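
The header filtering used in this answer can be checked in isolation. Below is a small sketch: `strip_vary` is a hypothetical helper, and `delim_re` is a local stand-in written for this example rather than the object imported from Django. It compares names case-insensitively, on the assumption that `Vary` usually carries `Cookie` capitalized.

```python
import re

# Local stand-in for a comma-delimiter regex (assumption, not Django's object).
delim_re = re.compile(r"\s*,\s*")


def strip_vary(vary, hidden=("cookie",)):
    # Return the Vary value without the header names listed in `hidden`.
    names = [name for name in delim_re.split(vary.strip()) if name]
    kept = [name for name in names if name.lower() not in hidden]
    return ", ".join(kept)


print(strip_vary("Accept-Language, Cookie"))
print(strip_vary("Cookie"))
```

If Cookie is the only entry, the result is an empty string, so the middleware would compute its cache key without any header values.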

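@Ben Not much difference. But I prefer using inheritance here over copying code. It is cleaner, leaves less code in your project, and causes less trouble when you upgrade Django. - okm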
