Creating nodes with Neo4J Java Bolt is slow. How can I speed it up?

4

I am trying to insert a batch of nodes into Neo4J with the following code:

import org.neo4j.driver.v1.*;

public class BoltClass
{
    public static void minimalWorkingExample() throws Exception
    {
        Driver driver = GraphDatabase.driver( "bolt://localhost", AuthTokens.basic( "neo4j", "admin4j" ) );
        Session session = driver.session();

        int k=0;

        for (int i = 0; i < 1000; i++) {
            int count = 1000;
            long begin = System.currentTimeMillis();
            for (int j = 0; j < count; j ++) {
                session.run("CREATE (a:Person {id:" + k + ", name:'unknown'})");
            }
            long end = System.currentTimeMillis();
            System.out.print("Inserting " + (double)count/((double)(end-begin)/count) + " nodes per second.\n");
            k++;
        }

        session.close();
        driver.close();
    }

    public static void main(String[] args)
    {
        try {
            minimalWorkingExample();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Result:

Inserting 58.8235294117647 nodes per second.
Inserting 76.92307692307692 nodes per second.
Inserting 50.0 nodes per second.
Inserting 76.92307692307692 nodes per second.
Inserting 55.55555555555556 nodes per second.
Inserting 62.5 nodes per second.
Inserting 66.66666666666667 nodes per second.
Inserting 55.55555555555556 nodes per second.
Inserting 62.5 nodes per second.
Inserting 55.55555555555556 nodes per second.
Inserting 47.61904761904762 nodes per second.
Inserting 45.45454545454545 nodes per second.
Inserting 58.8235294117647 nodes per second.
Inserting 83.33333333333333 nodes per second.

I am using Neo4j 3.0.3 and org.neo4j.driver 1.0.4. The graph was empty before the inserts. The machine has an i5 2-2.6GHz CPU and 8GB RAM.
Update: I have just discovered transactions.
public static void TransactionExample() throws Exception
{
    Driver driver = GraphDatabase.driver( "bolt://localhost", AuthTokens.basic( "neo4j", "admin4j" ) );
    Session session = driver.session();

    int k=0;
    for (int i = 0; i < 1000; i++) {
        int count = 1000;
        long begin = System.currentTimeMillis();

        try ( Transaction tx = session.beginTransaction() )
        {
            for (int j = 0; j < count; j ++) {
                tx.run("CREATE (a:Person {id:" + k + ", name:'unknown'})");
            }
            tx.success();
        }
        long end = System.currentTimeMillis();
        System.out.print("Inserting " + (double)count/((double)(end-begin)/count) + " nodes per second.\n");
        k++;
    }
    session.close();
    driver.close();
}

Result:

Inserting 20000.0 nodes per second.
Inserting 17857.142857142855 nodes per second.
Inserting 18867.924528301886 nodes per second.
Inserting 15384.615384615385 nodes per second.
Inserting 19607.843137254902 nodes per second.
Inserting 16666.666666666668 nodes per second.
Inserting 16393.44262295082 nodes per second.

That is a nice performance gain. Can it be improved any further?

2 Answers

3

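For your statement you should also use parameters, {id} and {name}. Otherwise Cypher has to re-parse and recompile every query, which costs time. With parameters, the query is compiled once and the compiled plan is reused.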
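To illustrate the point about compiled plans, here is a toy simulation in plain Java. `ToyPlanner` is a hypothetical stand-in invented for this sketch, not part of Neo4j or the driver. It assumes plans are cached per distinct query text, which is a simplification of what the server actually does.

```java
import java.util.HashMap;
import java.util.Map;

class PlanCacheSketch {

    /** Hypothetical stand-in for a server that caches one compiled plan per query text. */
    static final class ToyPlanner {
        private final Map<String, String> planCache = new HashMap<>();
        private int compilations = 0;

        void run(String queryText, Map<String, Object> parameters) {
            // "Compile" only when this exact query text has not been seen before.
            planCache.computeIfAbsent(queryText, text -> {
                compilations++;
                return "plan(" + text + ")";
            });
            // Parameters are bound at execution time and are not part of the cache key.
        }

        int compilations() {
            return compilations;
        }
    }

    /** Builds a new query text per statement by concatenating the id into it. */
    static int compilationsWithConcatenation(int statements) {
        ToyPlanner planner = new ToyPlanner();
        for (int k = 0; k < statements; k++) {
            planner.run("CREATE (a:Person {id:" + k + ", name:'unknown'})",
                    new HashMap<String, Object>());
        }
        return planner.compilations();
    }

    /** Reuses one constant query text and passes the values as parameters. */
    static int compilationsWithParameters(int statements) {
        ToyPlanner planner = new ToyPlanner();
        String query = "CREATE (a:Person {id:{id}, name:{name}})";
        for (int k = 0; k < statements; k++) {
            Map<String, Object> parameters = new HashMap<>();
            parameters.put("id", k);
            parameters.put("name", "unknown");
            planner.run(query, parameters);
        }
        return planner.compilations();
    }

    public static void main(String[] args) {
        System.out.println("concatenation: " + compilationsWithConcatenation(1000) + " compilations");
        System.out.println("parameters:    " + compilationsWithParameters(1000) + " compilations");
    }
}
```

With 1000 statements whose id changes each time, the concatenated variant produces 1000 distinct query texts in this toy model, while the parameterized variant produces one.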

You should also increment the variable k inside the inner loop:

public static void TransactionExample() throws Exception
{
    Driver driver = GraphDatabase.driver("bolt://localhost", AuthTokens.basic("neo4j", "admin4j"));
    Session session = driver.session();
    int k=0;
    String query = "CREATE (a:Person {id:{id}, name:{name}})";
    for (int i = 0; i < 1000; i++) {
        int count = 1000;
        long begin = System.currentTimeMillis();

        try (Transaction tx = session.beginTransaction())
        {
            for (int j = 0; j < count; j++) {
                tx.run(query, Values.parameters("id", k, "name", "unknown"));
                k++;
            }
            tx.success();
        }
        long end = System.currentTimeMillis();
        System.out.print("Inserting " + (double)count/((double)(end-begin)/count) + " nodes per second.\n");
    }
    session.close();
    driver.close();
}
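A side note on the timing line used in these snippets: `count / ((end - begin) / count)` only yields nodes per second because `count` happens to be 1000, the same as the number of milliseconds in a second. The sketch below shows a variant that does not depend on the batch size. The class and method names are made up for illustration.

```java
class ThroughputSketch {

    /** Nodes per second for a batch of `count` nodes that took `elapsedMillis` milliseconds. */
    static double nodesPerSecond(int count, long elapsedMillis) {
        // Multiply first so the conversion from milliseconds to seconds is explicit.
        return count * 1000.0 / elapsedMillis;
    }

    /** The expression used in the snippets above, kept only for comparison. */
    static double originalFormula(int count, long elapsedMillis) {
        return (double) count / ((double) elapsedMillis / count);
    }

    public static void main(String[] args) {
        // Same batch size as in the snippets: both expressions agree.
        System.out.println(nodesPerSecond(1000, 50));
        System.out.println(originalFormula(1000, 50));

        // Different batch size: the original expression no longer means "per second".
        System.out.println(nodesPerSecond(500, 50));
        System.out.println(originalFormula(500, 50));
    }
}
```

If the batch size is ever changed from 1000, only the first method keeps reporting a rate per second.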

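Nice improvement, thank you. Please add the missing curly brace to the Cypher statement, since the site will not let me edit your post. - Nick
Great answer. Does wrapping the transaction in try (...) mean that tx.close() is called automatically? - Aviran Katz
@AviranKatz Apparently yes. The session javadoc says: "When this method returns, all outstanding statements in the session have completed, meaning any writes you performed are durably stored." - Johannes
Note that as of v4.x.x the old parameter syntax {param} is no longer supported. Use $param instead, e.g. "CREATE (a:Person {id:$id, name:$name})". - Klajd Deda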

1

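Adding an index (or a unique constraint) on :Person(id) beforehand probably will not speed up the inserts. It may even slow them down slightly, because the index has to be updated. It should, however, significantly speed up any later operation that needs to match :Person nodes by id, such as adding relationships from a :Person to other nodes or adding properties to a person.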
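The trade-off can be pictured with a plain Java analogy, which is only an analogy and not Neo4j code. A list scan stands in for matching without an index, and a `HashMap` keyed by id stands in for the index. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IndexLookupSketch {

    static final class Person {
        final int id;
        final String name;

        Person(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Without an index: inspect every person until the id matches. */
    static Person findByScan(List<Person> people, int id) {
        for (Person person : people) {
            if (person.id == id) {
                return person;
            }
        }
        return null;
    }

    /** With an index: a single keyed lookup. */
    static Person findByIndex(Map<Integer, Person> index, int id) {
        return index.get(id);
    }

    public static void main(String[] args) {
        List<Person> people = new ArrayList<>();
        Map<Integer, Person> index = new HashMap<>();
        for (int id = 0; id < 100_000; id++) {
            Person person = new Person(id, "unknown");
            people.add(person);
            // Extra work on every insert: this is the cost of maintaining the index.
            index.put(id, person);
        }
        // Both lookups return the same person; the scan touches up to 100,000 entries.
        System.out.println(findByScan(people, 99_999).id);
        System.out.println(findByIndex(index, 99_999).id);
    }
}
```

Each insert pays a little extra to update the map, and each later lookup by id avoids walking the whole list.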

