The Celery Redis instance keeps filling up even though the queues appear empty.

We have a Django application that needs to fetch large amounts of data using Celery. Every few minutes, 20 or so Celery workers are running. We run on Google Kubernetes Engine, with a Redis queue on Cloud Memorystore.

The Redis instance we use for Celery keeps filling up, even though the queue is empty according to Flower. Eventually the Redis database fills up completely and Celery starts throwing errors.

In Flower I can see tasks coming in and going out, and I've scaled the workers up to the point where the queue is always empty.

If I run redis-cli --bigkeys, I see:

# Scanning the entire keyspace to find biggest keys as well as
# average sizes per key type.  You can use -i 0.1 to sleep 0.1 sec
# per 100 SCAN commands (not usually needed).

[00.00%] Biggest set    found so far '_kombu.binding.my-queue-name-queue' with 1 members
[00.00%] Biggest list   found so far 'default' with 611 items
[00.00%] Biggest list   found so far 'my-other-queue-name-queue' with 44705 items
[00.00%] Biggest set    found so far '_kombu.binding.celery.pidbox' with 19 members
[00.00%] Biggest list   found so far 'my-queue-name-queue' with 727179 items
[00.00%] Biggest set    found so far '_kombu.binding.celeryev' with 22 members

-------- summary -------

Sampled 12 keys in the keyspace!
Total key length in bytes is 271 (avg len 22.58)

Biggest   list found 'my-queue-name-queue' has 727179 items
Biggest    set found '_kombu.binding.celeryev' has 22 members

4 lists with 816144 items (33.33% of keys, avg size 204036.00)
0 hashs with 0 fields (00.00% of keys, avg size 0.00)
0 strings with 0 bytes (00.00% of keys, avg size 0.00)
0 streams with 0 entries (00.00% of keys, avg size 0.00)
8 sets with 47 members (66.67% of keys, avg size 5.88)
0 zsets with 0 members (00.00% of keys, avg size 0.00)

If I inspect the queue with LRANGE, I see a lot of objects like this:

"{\"body\": \"W1syNDQ0NF0sIHsicmVmZXJlbmNlX3RpbWUiOiBudWxsLCAibGF0ZXN0X3RpbWUiOiBudWxsLCAicm9sbGluZyI6IGZhbHNlLCAidGltZWZyYW1lIjogIjFkIiwgIl9udW1fcmV0cmllcyI6IDF9LCB7ImNhbGxiYWNrcyI6IG51bGwsICJlcnJiYWNrcyI6IG51bGwsICJjaGFpbiI6IG51bGwsICJjaG9yZCI6IG51bGx9XQ==\", \"content-encoding\": \"utf-8\", \"content-type\": \"application/json\", \"headers\": {\"lang\": \"py\", \"task\": \"MyDataCollectorClass\", \"id\": \"646910fc-f9db-48c3-b5a9-13febbc00bde\", \"shadow\": null, \"eta\": \"2019-08-20T02:31:05.113875+00:00\", \"expires\": null, \"group\": null, \"retries\": 0, \"timelimit\": [null, null], \"root_id\": \"beeff557-66be-451d-9c0c-dc622ca94493\", \"parent_id\": \"374d8e3e-92b5-423e-be58-e043999a1722\", \"argsrepr\": \"(24444,)\", \"kwargsrepr\": \"{'reference_time': None, 'latest_time': None, 'rolling': False, 'timeframe': '1d', '_num_retries': 1}\", \"origin\": \"gen1@celery-my-queue-name-worker-6595bd8fd8-8vgzq\"}, \"properties\": {\"correlation_id\": \"646910fc-f9db-48c3-b5a9-13febbc00bde\", \"reply_to\": \"e55a31ed-cbba-3d79-9ffc-c19a29e77aac\", \"delivery_mode\": 2, \"delivery_info\": {\"exchange\": \"\", \"routing_key\": \"my-queue-name-queue\"}, \"priority\": 0, \"body_encoding\": \"base64\", \"delivery_tag\": \"a83074a5-8787-49e3-bb7d-a0e69ba7f599\"}}"
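The `body` field is just base64-encoded JSON, so you can decode one of these messages to confirm what is actually sitting in the list. A quick sketch using only the standard library, with the payload copied from the message above:

```python
import base64
import json

# "body" field copied from the message shown above
raw = "W1syNDQ0NF0sIHsicmVmZXJlbmNlX3RpbWUiOiBudWxsLCAibGF0ZXN0X3RpbWUiOiBudWxsLCAicm9sbGluZyI6IGZhbHNlLCAidGltZWZyYW1lIjogIjFkIiwgIl9udW1fcmV0cmllcyI6IDF9LCB7ImNhbGxiYWNrcyI6IG51bGwsICJlcnJiYWNrcyI6IG51bGwsICJjaGFpbiI6IG51bGwsICJjaG9yZCI6IG51bGx9XQ=="

# Celery protocol v2 message bodies decode to [args, kwargs, embed]
args, kwargs, embed = json.loads(base64.b64decode(raw))
print(args)                  # [24444]
print(kwargs["timeframe"])   # 1d
```

This matches the `argsrepr`/`kwargsrepr` headers, i.e. these really are pending task messages rather than stored results.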

We're using django-celery-results to store results, so these shouldn't be in there, and we use a separate Redis instance for Django's cache.

If I clear Redis with FLUSHALL, it slowly fills up again.

I'm not sure where to go from here. I'm not very knowledgeable about Redis — maybe there's something I can do to inspect the data and see what's taking up the space? Maybe Flower isn't reporting correctly? Maybe Celery keeps completed tasks around for a while, even though we use the Django database for results? Thanks for any help.

Did you ever find a solution? – zerohedge
2 Answers

It sounds like Redis isn't configured to delete completed items, or to report and delete failed ones — that is, it may be putting tasks on a list but never removing them.

Have a look at the PyPI packages rq, django-rq, and django-rq-scheduler. You can read about how this works here: https://python-rq.org/docs/
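If the issue really is completed items lingering in Redis, the Celery-side equivalent of this advice is to make result expiry explicit. A hedged sketch of the standard settings involved (not a confirmed fix for this particular case):

```python
# settings.py -- sketch only; Django-style CELERY_ namespaced settings
CELERY_RESULT_BACKEND = "django-db"  # store results via django-celery-results
CELERY_RESULT_EXPIRES = 3600         # prune stored results after one hour
# CELERY_TASK_IGNORE_RESULT = True   # or: don't store results at all
```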

