Force MediaWiki to fill the Squid cache with all pages

Question

apache caching mediawiki squid

To speed up a MediaWiki site whose content uses a lot of templates but is otherwise mostly static, I would like to set up a Squid server. See

https://www.mediawiki.org/wiki/Manual:PurgeList.php

and

https://www.mediawiki.org/wiki/Manual:Squid_caching

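and then fill the Squid server's cache "automatically" with a script that makes wget/curl calls hitting every page of the MediaWiki. My expectation is that after this procedure every page is in the Squid cache (if I make it big enough), and every access is then served by Squid. How do I get this working? For example: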

How do I check my configuration?
How do I find out how much memory is needed?
How can I check whether the pages are in the squid3 cache?
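
For the last of these checks, one common approach is to look at the X-Cache response header, which Squid normally adds with a value such as "HIT from <host>" or "MISS from <host>". The following is only a minimal Python sketch under that assumption; the function names cache_status and check_page are made up for illustration, and a proxy configured to strip the header would report UNKNOWN.

```python
from urllib.request import urlopen


def cache_status(x_cache_header):
    """Classify an X-Cache header value as 'HIT', 'MISS' or 'UNKNOWN'."""
    parts = (x_cache_header or "").split()
    # Only the first word is inspected, so a host name that happens to
    # contain "hit" cannot turn a MISS into a HIT.
    first = parts[0].upper() if parts else ""
    return first if first in ("HIT", "MISS") else "UNKNOWN"


def check_page(url):
    """Fetch url through the proxy and report how the proxy answered."""
    with urlopen(url) as response:
        return cache_status(response.headers.get("X-Cache"))
```

Calling check_page twice on the same page URL and seeing MISS followed by HIT would suggest that the page was stored after the first request.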

What I have tried so far

I started by working out how to install Squid, using:

https://wiki.ubuntuusers.de/squid

and

https://www.mediawiki.org/wiki/Manual:Squid_caching

I found my IP address xx.xxx.xxx.xxx (not disclosed here) with ifconfig eth0.

In /etc/squid3/squid.conf I put the following:

http_port xx.xxx.xxx.xxx:80 transparent vhost defaultsite=XXXXXX
cache_peer 127.0.0.1 parent 80 3130 originserver 

acl manager proto cache_object
acl localhost src 127.0.0.1/32

# Allow access to the web ports
acl web_ports port 80
http_access allow web_ports

# Allow cachemgr access from localhost only for maintenance purposes
http_access allow manager localhost
http_access deny manager

# Allow cache purge requests from MediaWiki/localhost only
acl purge method PURGE
http_access allow purge localhost
http_access deny purge

# And finally deny all other access to this proxy
http_access deny all

Then I configured my apache2 server:

# /etc/apache2/sites-enabled/000-default.conf   
Listen 127.0.0.1:80

I added

$wgUseSquid = true;
$wgSquidServers = array('xx.xxx.xxx.xxx');
$wgSquidServersNoPurge = array('127.0.0.1');

to my LocalSettings.php file.

Then I restarted apache2 and started squid3 with

service squid3 restart

and tried a first access with

wget --cache=off -r http://XXXXXX/mediawiki

The result is:

Resolving XXXXXXX (XXXXXXX)... xx.xxx.xxx.xxx
Connecting to XXXXXXX (XXXXXXX|xx.xxx.xx.xxx|:80... failed: Connection refused.

- Wolfgang Fahl

1 Answer

- Drew Anderson · Answer 1

Assuming Apache 2.x.

Although this has nothing to do with Squid, you can achieve this with Apache modules alone. Have a look at mod_cache: https://httpd.apache.org/docs/2.2/mod/mod_cache.html

You simply add it to your Apache configuration and ask Apache to cache the rendered content on disk.

You need to make sure the generated PHP response carries appropriate cache expiry information; MediaWiki should take care of this for you.
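
To see what "appropriate cache expiry information" means in practice, you can inspect the Cache-Control header of a response. The Python sketch below is a simplified illustration, not what Apache or Squid actually implement: it only looks at private, no-store, s-maxage and max-age, and ignores Expires, no-cache and heuristic freshness. The function names and the sample header values in the usage note are made up for the example.

```python
def parse_cache_control(header_value):
    """Parse a Cache-Control header into a dict of lowercase directives."""
    directives = {}
    for part in (header_value or "").split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"')
    return directives


def shared_cache_lifetime(header_value):
    """Return the explicit lifetime in seconds for a shared cache.

    Returns 0 when storing is forbidden or no explicit lifetime is given.
    """
    directives = parse_cache_control(header_value)
    if "private" in directives or "no-store" in directives:
        return 0
    # For shared caches, s-maxage takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        if name in directives:
            try:
                return max(int(directives[name]), 0)
            except ValueError:
                return 0
    return 0
```

For instance, a header like "s-maxage=18000, must-revalidate, max-age=0" yields 18000, while "private, must-revalidate, max-age=0" yields 0, so a response carrying the latter would not be stored by a shared cache.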

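Adding a cache layer like this may not give the result you expect, because the layer does not know whether a page has changed. Cache management is difficult here, and it should only be used for genuinely static content.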

Ubuntu:

a2enmod cache cache_disk

Apache configuration:

CacheRoot /var/cache/apache2/mod_disk_cache
CacheEnable disk /

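I would not recommend pre-filling the cache by visiting every page. That only makes rarely used pages take up valuable space/memory. If you still want to do it, you could look at wget: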

Description from: http://www.linuxjournal.com/content/downloading-entire-web-site-wget
$ wget \
     --recursive \
     --no-clobber \
     --page-requisites \
     --html-extension \
     --convert-links \
     --restrict-file-names=windows \
     --domains website.org \
     --no-parent \
         www.website.org/tutorials/html/

This command downloads the Web site www.website.org/tutorials/html/.

The options are:

    --recursive: download the entire Web site.

    --domains website.org: don't follow links outside website.org.

    --no-parent: don't follow links outside the directory tutorials/html/.

    --page-requisites: get all the elements that compose the page (images, CSS and so on).

    --html-extension: save files with the .html extension.

    --convert-links: convert links so that they work locally, off-line.

    --restrict-file-names=windows: modify filenames so that they will work in Windows as well.

    --no-clobber: don't overwrite any existing files (used in case the download is interrupted and resumed).
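
As an alternative to crawling links, the page titles can be listed through the MediaWiki API (action=query with list=allpages) and each page requested once. The following is only a rough Python sketch under several assumptions: the API URL and the article base URL are placeholders you pass in, the requests must go to the address the cache listens on, and only the default namespace is listed. All function names are made up for the example.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def allpages_url(api_url, continue_from=None, limit=500):
    """Build an API URL that lists one batch of page titles."""
    params = {"action": "query", "list": "allpages",
              "aplimit": str(limit), "format": "json"}
    if continue_from is not None:
        params["apcontinue"] = continue_from
    return api_url + "?" + urlencode(params)


def titles_and_continuation(api_response):
    """Extract the titles and the token for the next batch (or None)."""
    pages = api_response.get("query", {}).get("allpages", [])
    titles = [page["title"] for page in pages]
    return titles, api_response.get("continue", {}).get("apcontinue")


def page_url(article_base, title):
    """Turn a title into a page URL; spaces become underscores."""
    # Assumption: visitors use URLs with ':' and similar characters unescaped.
    return article_base + quote(title.replace(" ", "_"), safe="/:;@$!*(),~")


def warm_cache(api_url, article_base):
    """Request every listed page once and return how many were fetched."""
    token, fetched = None, 0
    while True:
        with urlopen(allpages_url(api_url, token)) as response:
            titles, token = titles_and_continuation(json.load(response))
        for title in titles:
            with urlopen(page_url(article_base, title)) as page:
                page.read()
            fetched += 1
        if token is None:
            return fetched
```

A call might look like warm_cache("http://wiki.example/mediawiki/api.php", "http://wiki.example/mediawiki/index.php/"), where both URLs are hypothetical. A cache generally keys on the exact URL, so the warm-up only helps if these URLs match the ones your visitors actually request.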

A better option: Memcached

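MediaWiki also supports Memcached as a very fast in-memory caching service for data and templates only. This is not as heavy-handed as a full-site cache such as Squid or Apache mod_cache. MediaWiki manages Memcached itself, so any change is reflected in the cache store immediately, which means your content is always valid.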

See the installation instructions on MediaWiki: https://www.mediawiki.org/wiki/Memcached

My recommendation is not to use Apache mod_cache or Squid for this task, but to install Memcached and configure MediaWiki to use it.