SQLAlchemy 和 SQLite 共享缓存

Question

SQLAlchemy 和 SQLite 共享缓存

20

SQLite支持"共享缓存"的:memory:数据库，当使用特殊URI打开时(根据sqlite.org):

两个或多个数据库连接可以按以下方式打开相同的内存数据库：

rc = sqlite3_open("file::memory:?cache=shared",&db);

在Python 3.4中，我可以通过使用sqlite3.connect()的URI参数来利用它:

sqlite3.connect('file::memory:?cache=shared', uri=True)

然而，我似乎无法让SQLAlchemy实现同样的功能：

engine = sqlalchemy.create_engine('sqlite:///:memory:?cache=shared')
engine.connect()
...
TypeError: 'cache' is an invalid keyword argument for this function

有没有办法让SQLAlchemy利用共享缓存？

编辑：
在Python 3.4上，我可以使用create_engine的creator参数来解决这个问题，但是在其他Python版本上仍然存在问题：

creator = lambda: sqlite3.connect('file::memory:?cache=shared', uri=True)
engine = sqlalchemy.create_engine('sqlite://', creator=creator)
engine.connect()

- bbayles

4

我认为你无法在不将新的sqlite3模块功能移植到早期版本的情况下完成此操作。 - codeape

2个回答

8

SQLAlchemy文档关于SQLite方言详细描述了问题及其解决方法：

Threading/Pooling Behavior

Pysqlite’s default behavior is to prohibit the usage of a single connection in more than one thread. This is originally intended to work with older versions of SQLite that did not support multithreaded operation under various circumstances. In particular, older SQLite versions did not allow a :memory: database to be used in multiple threads under any circumstances.

Pysqlite does include a now-undocumented flag known as check_same_thread which will disable this check, however note that pysqlite connections are still not safe to use in concurrently in multiple threads. In particular, any statement execution calls would need to be externally mutexed, as Pysqlite does not provide for thread-safe propagation of error messages among other things. So while even :memory: databases can be shared among threads in modern SQLite, Pysqlite doesn’t provide enough thread-safety to make this usage worth it.

SQLAlchemy sets up pooling to work with Pysqlite’s default behavior:

When a :memory: SQLite database is specified, the dialect by default will use SingletonThreadPool. This pool maintains a single connection per thread, so that all access to the engine within the current thread use the same :memory: database - other threads would access a different :memory: database.

When a file-based database is specified, the dialect will use NullPool as the source of connections. This pool closes and discards connections which are returned to the pool immediately. SQLite file-based connections have extremely low overhead, so pooling is not necessary. The scheme also prevents a connection from being used again in a different thread and works best with SQLite’s coarse-grained file locking.

Using a Memory Database in Multiple Threads

To use a :memory: database in a multithreaded scenario, the same connection object must be shared among threads, since the database exists only within the scope of that connection. The StaticPool implementation will maintain a single connection globally, and the check_same_thread flag can be passed to Pysqlite as False:
from sqlalchemy.pool import StaticPool
engine = create_engine('sqlite://',
              connect_args={'check_same_thread':False},
              poolclass=StaticPool)
Note that using a :memory: database in multiple threads requires a recent version of SQLite.

来源：https://docs.sqlalchemy.org/en/13/dialects/sqlite.html#threading-pooling-behavior

本文介绍了SQLAlchemy在SQLite数据库中的线程池行为。在单线程模式下，所有连接将由同一个线程处理；在多线程模式下，每个线程都将有自己的连接。此外，当使用共享内存（例如：:memory:）时，线程安全性会受到限制。

- Hubert Grzeskowiak

1

亲爱的上帝，每个_thread_（而不是_process_）都有自己独立的DB（无法访问另一个线程的DB），这一点非常微妙。这篇文章为我节省了数天的工作量。 - ijoseph

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Álvaro Justen · Accepted Answer

您应该避免在旧的Python版本中传递 uri=True，这样问题就会得到解决：

import sqlite3
import sys

import sqlalchemy


DB_URI = 'file::memory:?cache=shared'
PY2 = sys.version_info.major == 2
if PY2:
    params = {}
else:
    params = {'uri': True}

creator = lambda: sqlite3.connect(DB_URI, **params)

engine = sqlalchemy.create_engine('sqlite:///:memory:', creator=creator)
engine.connect()