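I want to speed up loading data into PostgreSQL. I started using pgloader (https://github.com/dimitri/pgloader) and tried to take advantage of parallel loading. I have tried adjusting various parameters, but I cannot get more than two cores busy on my machine, which has 32 cores. I found the documentation at https://github.com/dimitri/pgloader/blob/master/pgloader.1.md and tried setting the batch options described there. My current settings are: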
LOAD CSV
FROM '/home/data1_1.csv'
--FROM 'data/data.csv'
INTO postgresql://:postgres@localhost:5432/test?test
WITH truncate,
skip header = 0,
fields optionally enclosed by '"',
fields escaped by double-quote,
fields terminated by ',',
batch rows = 100,
batch size = 1MB,
batch concurrency = 64
SET client_encoding to 'utf-8',
work_mem to '10000MB',
maintenance_work_mem to '20000 MB'