我们的一个应用程序刚刚遭受了一些麻烦的死锁。我花了很长时间来重新创建这个问题,因为在我的Java应用程序日志中,死锁(或堆栈跟踪)并没有立即显示出来。
让我感到惊讶的是,Marklogic Java API会重试失败的请求(例如由于死锁)。如果您的请求不是多语句请求,这可能是有意义的,但否则我不确定它是否有效。
所以让我们继续解决这个死锁问题。我创建了一个简单的代码片段,在其中有意地创建了一个死锁。该片段创建一个名为“test.xml”的文档,然后尝试从两个不同的事务中读取和写入,每个事务都在新线程上运行。
public static void main(String[] args) throws Exception {
final Logger root = (Logger) LoggerFactory.getLogger(Logger.ROOT_LOGGER_NAME);
final Logger ok = (Logger) LoggerFactory.getLogger(OkHttpServices.class);
root.setLevel(Level.ALL);
ok.setLevel(Level.ALL);
final DatabaseClient client = DatabaseClientFactory.newClient("localhost", 8000, new DatabaseClientFactory.DigestAuthContext("username", "password"));
final StringHandle handle = new StringHandle("<doc><name>Test</name></doc>")
.withFormat(Format.XML);
client.newTextDocumentManager().write("test.xml", handle);
root.info("t1: opening");
final Transaction t1 = client.openTransaction();
root.info("t1: reading");
client.newXMLDocumentManager()
.read("test.xml", new StringHandle(), t1);
root.info("t2: opening");
final Transaction t2 = client.openTransaction();
root.info("t2: reading");
client.newXMLDocumentManager()
.read("test.xml", new StringHandle(), t2);
new Thread(() -> {
root.info("t1: writing");
client.newXMLDocumentManager().write("test.xml", new StringHandle("<doc><t>t1</t></doc>").withFormat(Format.XML), t1);
t1.commit();
}).start();
new Thread(() -> {
root.info("t2: writing");
client.newXMLDocumentManager().write("test.xml", new StringHandle("<doc><t>t2</t></doc>").withFormat(Format.XML), t2);
t2.commit();
}).start();
TimeUnit.MINUTES.sleep(5);
client.release();
}
这段代码将产生以下日志:
14:12:27.437 [main] DEBUG c.m.client.impl.OkHttpServices - Connecting to localhost at 8000 as admin
14:12:27.570 [main] DEBUG c.m.client.impl.OkHttpServices - Sending test.xml document in transaction null
14:12:27.608 [main] INFO ROOT - t1: opening
14:12:27.609 [main] DEBUG c.m.client.impl.OkHttpServices - Opening transaction
14:12:27.962 [main] INFO ROOT - t1: reading
14:12:27.963 [main] DEBUG c.m.client.impl.OkHttpServices - Getting test.xml in transaction 5298588351036278526
14:12:28.283 [main] INFO ROOT - t2: opening
14:12:28.283 [main] DEBUG c.m.client.impl.OkHttpServices - Opening transaction
14:12:28.286 [main] INFO ROOT - t2: reading
14:12:28.286 [main] DEBUG c.m.client.impl.OkHttpServices - Getting test.xml in transaction 8819382734425123844
14:12:28.289 [Thread-1] INFO ROOT - t1: writing
14:12:28.289 [Thread-1] DEBUG c.m.client.impl.OkHttpServices - Sending test.xml document in transaction 5298588351036278526
14:12:28.289 [Thread-2] INFO ROOT - t2: writing
14:12:28.290 [Thread-2] DEBUG c.m.client.impl.OkHttpServices - Sending test.xml document in transaction 8819382734425123844
t1
和t2
都不会被提交。MarkLogic日志确认实际上存在死锁:
==> /var/opt/MarkLogic/Logs/8000_AccessLog.txt <==
127.0.0.1 - admin [24/Nov/2018:14:12:30 +0000] "PUT /v1/documents?txid=5298588351036278526&category=content&uri=test.xml HTTP/1.1" 503 1034 - "okhttp/3.9.0"
==> /var/opt/MarkLogic/Logs/ErrorLog.txt <==
2018-11-24 14:12:30.719 Info: Deadlock detected locking Documents test.xml
如果其中一个请求失败并抛出异常,这将不是问题,但事实并非如此。MarkLogic Java Api会重试每个请求,最多达到120秒,而其中一个更新在大约120秒后超时:
Exception in thread "Thread-1" com.marklogic.client.FailedRequestException: Service unavailable and maximum retry period elapsed: 121 seconds after 65 retries
at com.marklogic.client.impl.OkHttpServices.putPostDocumentImpl(OkHttpServices.java:1422)
at com.marklogic.client.impl.OkHttpServices.putDocument(OkHttpServices.java:1256)
at com.marklogic.client.impl.DocumentManagerImpl.write(DocumentManagerImpl.java:920)
at com.marklogic.client.impl.DocumentManagerImpl.write(DocumentManagerImpl.java:758)
at com.marklogic.client.impl.DocumentManagerImpl.write(DocumentManagerImpl.java:717)
at Scratch.lambda$main$0(scratch.java:40)
at java.lang.Thread.run(Thread.java:748)
有哪些可能的解决方法?一种方法可能是为事务设置最大生存时间(比如5秒),但这感觉有点像临时抱佛脚,不太可靠。还有其他的想法吗?我应该检查其他的设置吗?
我使用的是MarkLogic 9.0-7.2和marklogic-client-api:4.0.3。
编辑:解决死锁的一种方法是同步调用函数,这实际上是我在我的情况下解决问题的方法(请参见注释)。但我认为根本问题仍然存在。在多语句事务中出现死锁不应该被隐藏在120秒的超时之后。我宁愿要一个立即失败的请求,而不是一个120秒的文档锁定+每个线程64次失败重试。