Hibernate批量大小混淆

37

这个程序需要连续插入成千上万条数据。我以前从未使用过Hibernate。但是现在我发现性能非常慢(如果我手动连接并执行SQL语句,速度快10-12倍)。根据许多Hibernate教程,我的batch_size设置为50。

以下是一次插入的日志记录,希望您能帮助我理解具体发生了什么:

START INSERT
11:02:56.121 [main] DEBUG org.hibernate.impl.SessionImpl - opened session at timestamp: 13106053761                                                                                            
11:02:56.121 [main] DEBUG o.h.transaction.JDBCTransaction - begin                                                                                                                               
11:02:56.121 [main] DEBUG org.hibernate.jdbc.ConnectionManager - opening JDBC connection                                                                                                             
11:02:56.121 [main] TRACE o.h.c.DriverManagerConnectionProvider - total checked-out connections: 0                                                                                                    
11:02:56.121 [main] TRACE o.h.c.DriverManagerConnectionProvider - using pooled JDBC connection, pool size: 0                                                                                          
11:02:56.121 [main] DEBUG o.h.transaction.JDBCTransaction - current autocommit status: false                                                                                                    
11:02:56.121 [main] TRACE org.hibernate.jdbc.JDBCContext - after transaction begin                                                                                                             
11:02:56.121 [main] TRACE org.hibernate.impl.SessionImpl - setting flush mode to: MANUAL                                                                                                       
11:02:56.121 [main] TRACE o.h.e.def.DefaultLoadEventListener - loading entity: [com.xyzcompany.foo.edoi.ejb.msw000.MSW000Rec#component[keyW000]{keyW000=F000    ADSUFC}]                           
11:02:56.121 [main] TRACE o.h.e.def.DefaultLoadEventListener - creating new proxy for entity                                                                                                       
11:02:56.122 [main] TRACE o.h.e.d.DefaultSaveOrUpdateEventListener - saving transient instance                                                                                                           
11:02:56.122 [main] DEBUG o.h.e.def.AbstractSaveEventListener - generated identifier: component[keyW000]{keyW000=F000    ADSUFC}, using strategy: org.hibernate.id.CompositeNestedGeneratedValueGenerator
11:02:56.122 [main] TRACE o.h.e.def.AbstractSaveEventListener - saving [com.xyzcompany.foo.edoi.ejb.msw000.MSW000Rec#component[keyW000]{keyW000=F000    ADSUFC}]                                    
11:02:56.123 [main] TRACE o.h.e.d.AbstractFlushingEventListener - flushing session                                                                                                                    
11:02:56.123 [main] DEBUG o.h.e.d.AbstractFlushingEventListener - processing flush-time cascades                                                                                                      
11:02:56.123 [main] DEBUG o.h.e.d.AbstractFlushingEventListener - dirty checking collections                                                                                                          
11:02:56.123 [main] TRACE o.h.e.d.AbstractFlushingEventListener - Flushing entities and processing referenced collections                                                                             
11:02:56.125 [main] TRACE o.h.e.d.AbstractFlushingEventListener - Processing unreferenced collections                                                                                                 
11:02:56.125 [main] TRACE o.h.e.d.AbstractFlushingEventListener - Scheduling collection removes/(re)creates/updates                                                                                   
11:02:56.126 [main] DEBUG o.h.e.d.AbstractFlushingEventListener - Flushed: 1 insertions, 0 updates, 0 deletions to 62 objects                                                                         
11:02:56.126 [main] DEBUG o.h.e.d.AbstractFlushingEventListener - Flushed: 0 (re)creations, 0 updates, 0 removals to 0 collections                                                                                                                                                                                       
11:02:56.132 [main] TRACE o.h.e.d.AbstractFlushingEventListener - executing flush                                                                                                                     
11:02:56.132 [main] TRACE org.hibernate.jdbc.ConnectionManager - registering flush begin                                                                                                             
11:02:56.132 [main] TRACE o.h.p.entity.AbstractEntityPersister - Inserting entity: [com.xyzcompany.foo.edoi.ejb.msw000.MSW000Rec#component[keyW000]{keyW000=F000    ADSUFC}]                         
11:02:56.132 [main] DEBUG org.hibernate.jdbc.AbstractBatcher - about to open PreparedStatement (open PreparedStatements: 0, globally: 0)                                                           
11:02:56.132 [main] DEBUG org.hibernate.SQL - insert into MSW000 (W000_DATA_REC, W000_FILE_FLAGS, KEY_W000) values (?, ?, ?)                                                      
11:02:56.132 [main] TRACE org.hibernate.jdbc.AbstractBatcher - preparing statement                                                                                                                 
11:02:56.132 [main] TRACE o.h.p.entity.AbstractEntityPersister - Dehydrating entity: [com.xyzcompany.foo.edoi.ejb.msw000.MSW000Rec#component[keyW000]{keyW000=F000    ADSUFC}]                       
11:02:56.132 [main] TRACE org.hibernate.type.StringType - binding '    ADSUFCA                                                                                                                                                                                                                                                                                  ' to parameter: 1
11:02:56.132 [main] TRACE org.hibernate.type.StringType - binding ' ' to parameter: 2                                                                                                         
11:02:56.132 [main] TRACE org.hibernate.type.StringType - binding 'F000    ADSUFC' to parameter: 3                                                                                            
11:02:56.132 [main] DEBUG org.hibernate.jdbc.AbstractBatcher - Executing batch size: 1                                                                                                             
11:02:56.133 [main] DEBUG org.hibernate.jdbc.AbstractBatcher - about to close PreparedStatement (open PreparedStatements: 1, globally: 1)                                                          
11:02:56.133 [main] TRACE org.hibernate.jdbc.AbstractBatcher - closing statement                                                                                                                   
11:02:56.133 [main] TRACE org.hibernate.jdbc.ConnectionManager - registering flush end                                                                                                               
11:02:56.133 [main] TRACE o.h.e.d.AbstractFlushingEventListener - post flush                                                                                                                          
11:02:56.133 [main] DEBUG o.h.transaction.JDBCTransaction - commit                                                                                                                              
11:02:56.133 [main] TRACE org.hibernate.impl.SessionImpl - automatically flushing session                                                                                                      
11:02:56.133 [main] TRACE org.hibernate.jdbc.JDBCContext - before transaction completion                                                                                                       
11:02:56.133 [main] TRACE org.hibernate.impl.SessionImpl - before transaction completion                                                                                                       
11:02:56.133 [main] DEBUG o.h.transaction.JDBCTransaction - committed JDBC Connection                                                                                                           
11:02:56.133 [main] TRACE org.hibernate.jdbc.JDBCContext - after transaction completion                                                                                                        
11:02:56.133 [main] DEBUG org.hibernate.jdbc.ConnectionManager - transaction completed on session with on_close connection release mode; be sure to close the session to release JDBC resources!     
11:02:56.133 [main] TRACE org.hibernate.impl.SessionImpl - after transaction completion                                                                                                        
11:02:56.133 [main] TRACE org.hibernate.impl.SessionImpl - closing session                                                                                                                     
11:02:56.133 [main] TRACE org.hibernate.jdbc.ConnectionManager - performing cleanup                                                                                                                  
11:02:56.133 [main] DEBUG org.hibernate.jdbc.ConnectionManager - releasing JDBC connection [ (open PreparedStatements: 0, globally: 0) (open ResultSets: 0, globally: 0)]                            
11:02:56.133 [main] TRACE o.h.c.DriverManagerConnectionProvider - returning connection to pool, pool size: 1                                                                                          
11:02:56.133 [main] TRACE org.hibernate.jdbc.JDBCContext - after transaction completion                                                                                                        
11:02:56.133 [main] DEBUG org.hibernate.jdbc.ConnectionManager - transaction completed on session with on_close connection release mode; be sure to close the session to release JDBC resources!     
11:02:56.134 [main] TRACE org.hibernate.impl.SessionImpl - after transaction completion                                                                                                        
FINISH INSERT
3个回答

114
当你调用session.save()时,Hibernate会生成一个INSERT SQL。这个INSERT SQL将在刷新期间(即session.flush())附加到要发出到数据库的语句中。
在刷新期间,如果hibernate.jdbc.batch_size设置为非零值,Hibernate将使用JDBC2 API引入的批处理功能将批量插入SQL发送到数据库。
例如,如果你保存了100条记录,而你的hibernate.jdbc.batch_size设置为50。在刷新期间,不是发出以下SQL 100次:
insert into TableA (id , fields) values (1, 'val1');
insert into TableA (id , fields) values (2, 'val2');
insert into TableA (id , fields) values (3, 'val3');
.........................
insert into TableA (id , fields) values (100, 'val100');

Hibernate会将它们分成50个一组,并且只向数据库发出2个SQL,就像这样:

insert into TableA (id , fields) values (1, 'val1') , (2, 'val2') ,(3, 'val3') ,(4, 'val4') ,......,(50, 'val50')
insert into TableA (id , fields) values (51, 'val51') , (52, 'val52') ,(53, 'val53') ,(54, 'val54'),...... ,(100, 'val100')  

请注意,如果插入表的主键为GenerationType.Identity,Hibernate将在JDBC级别上自动禁用插入批处理。
从您的日志中可以看出:您只保存了一个记录,然后调用了flush(),因此每次刷新都只有一个追加的INSERT SQL需要处理。这就是为什么Hibernate无法帮助您批量插入,因为只有一个INSERT SQL需要处理。您应该在调用flush()之前,先保存一定数量的记录而不是每个save()调用都调用flush()
批量插入的最佳实践如下:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for  ( int i=0; i<888888; i++ ) {
  TableA record = new TableA();
    record.setXXXX();
    session.save(record)
    if ( i % 50 == 0 ) { //50, same as the JDBC batch size
        //flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
tx.commit();
session.close();

你需要分批保存和刷新记录。在每个批次结束时,应清除持久化上下文以释放一些内存,以防止内存耗尽,因为每个持久化对象都被放置在第一级缓存中(即JVM的内存)。你还可以禁用二级缓存以减少不必要的开销。

参考:


2
非常棒的回答。如果我能给你更多的赞,我一定会的。我已经解决了它,但还是谢谢! - Dominic Bou-Samra
2
我想知道为什么在增加循环索引“i”之前要清除会话。我猜你应该从for循环中删除i++,并将条件更改为++i % 50 == 0。(Hibernate文档中是否有错误?) - DylanYi
@dylanyi 这只是一个例子。也许索引 i 与扫描源数组有关(请参见 record.setXXX:它从哪里获取数据?)。当您处理数百万条记录时,在开始时刷新为 0 并没有太大的区别。 - usr-local-ΕΨΗΕΛΩΝ
据我所知,默认情况下,Hibernate仅生成一个INSERT语句,然后使用JDBC批处理功能将每个参数的值数组绑定到参数,而不是每个参数一个值。如果提供程序(例如PostgreSQL、MySQL、FireBird或SQLite3)不支持数组绑定,则通过将行作为单独的INSERT来模拟它。请注意,一些提供程序需要在URI中设置“rewriteBatchedStatements=true”以更改此默认行为并启用多重插入。 - Arnaud Bouchez
1
当使用@Transactional注解时,只有在方法成功完成时才会提交事务(触发flush)。因此,如果仅依赖于@Transactional,则在TX提交时只会刷新一次。我们想要的是手动分批刷新session。 - Ken Chan
显示剩余4条评论

7
如果你必须在大批量作业中使用Hibernate,StatelessSession是最好的选择。它将事物简化为最基本的对象到SQL语句映射,并在将行直接插入数据库时消除了所有未使用的ORM功能的开销。
如果您提供实际代码,那么我们可以更容易地提出建议 :)

我知道会这样。代码实际上是由 COBOL 生成的 naca Java 代码,因此非常难以调试和理解。仍然没有找到插入的位置,唉。 - Dominic Bou-Samra

3
11:02:56.133 [main] DEBUG o.h.transaction.JDBCTransaction - commit      

这意味着每次插入操作后数据库都会进行提交。请确保您不要在插入循环内提交事务/关闭会话。而是在最后一次操作时执行。


你应该每隔几千条记录刷新和清除你的会话。 - jtahlborn
@jtahlborn 是的,非常好的观点,否则Hibernate会耗尽内存。 - prunge

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接