Speed of a hash join in Postgres

7
I have three tables I would like to join using inner joins in Postgres 9.1: reads, devices, and device_patients. Below is an abbreviated schema for each table.
reads -- ~250,000 rows
CREATE TABLE reads
  (
    id serial NOT NULL,
    device_id integer NOT NULL,
    value bigint NOT NULL,
    read_datetime timestamp without time zone NOT NULL,
    created_at timestamp without time zone NOT NULL,
    updated_at timestamp without time zone NOT NULL,
    CONSTRAINT reads_pkey PRIMARY KEY (id )
  )
WITH (
  OIDS=FALSE
);
ALTER TABLE reads
  OWNER TO postgres;

CREATE INDEX index_reads_on_device_id
  ON reads
  USING btree
  (device_id );

CREATE INDEX index_reads_on_read_datetime
  ON reads
  USING btree
  (read_datetime );

devices -- ~500 rows
CREATE TABLE devices
(
  id serial NOT NULL,
  serial_number character varying(20) NOT NULL,
  created_at timestamp without time zone NOT NULL,
  updated_at timestamp without time zone NOT NULL,
  CONSTRAINT devices_pkey PRIMARY KEY (id )
)
WITH (
  OIDS=FALSE
);
ALTER TABLE devices
  OWNER TO postgres;

CREATE UNIQUE INDEX index_devices_on_serial_number
  ON devices
  USING btree
  (serial_number COLLATE pg_catalog."default" );

patient_devices -- ~25,000 rows
CREATE TABLE patient_devices
(
  id serial NOT NULL,
  patient_id integer NOT NULL,
  device_id integer NOT NULL,
  issuance_datetime timestamp without time zone NOT NULL,
  unassignment_datetime timestamp without time zone,
  created_at timestamp without time zone NOT NULL,
  updated_at timestamp without time zone NOT NULL,
  CONSTRAINT patient_devices_pkey PRIMARY KEY (id )
)
WITH (
  OIDS=FALSE
);
ALTER TABLE patient_devices
  OWNER TO postgres;

CREATE INDEX index_patient_devices_on_device_id
  ON patient_devices
  USING btree
  (device_id );

CREATE INDEX index_patient_devices_on_issuance_datetime
  ON patient_devices
  USING btree
  (issuance_datetime );

CREATE INDEX index_patient_devices_on_patient_id
  ON patient_devices
  USING btree
  (patient_id );

CREATE INDEX index_patient_devices_on_unassignment_datetime
  ON patient_devices
  USING btree
  (unassignment_datetime );

patients -- ~1,000 rows
CREATE TABLE patients
(
  id serial NOT NULL,
  first_name character varying(50) NOT NULL,
  middle_name character varying(50),
  last_name character varying(50) NOT NULL,
  created_at timestamp without time zone NOT NULL,
  updated_at timestamp without time zone NOT NULL,
  CONSTRAINT participants_pkey PRIMARY KEY (id )
)
WITH (
  OIDS=FALSE
);
ALTER TABLE patients
  OWNER TO postgres;

Here is an abbreviated version of my query.
SELECT device_patients.patient_id, serial_number FROM reads
INNER JOIN devices ON devices.id = reads.device_id
INNER JOIN patient_devices ON device_patients.device_id = devices.id
WHERE (reads.read_datetime BETWEEN '2012-01-01 10:30:01.000000' AND '2013-05-18 03:03:42')
AND (read_datetime > issuance_datetime) AND ((unassignment_datetime IS NOT NULL AND read_datetime < unassignment_datetime) OR
(unassignment_datetime IS NULL))
GROUP BY serial_number, patient_devices.patient_id LIMIT 10

Ultimately this will be a small piece of a larger query (without the LIMIT; I only added the LIMIT to demonstrate that the long runtime is not due to returning a pile of rows), but I have done some experimenting and determined that this is the slow part of the larger query. When I run EXPLAIN ANALYZE on this query I get the following output (also viewable here).
Limit  (cost=156442.31..156442.41 rows=10 width=13) (actual time=2815.435..2815.441 rows=10 loops=1)
  ->  HashAggregate  (cost=156442.31..159114.89 rows=267258 width=13) (actual time=2815.432..2815.437 rows=10 loops=1)
        ->  Hash Join  (cost=1157.78..151455.79 rows=997304 width=13) (actual time=30.930..2739.164 rows=250150 loops=1)
              Hash Cond: (reads.device_id = devices.id)
              Join Filter: ((reads.read_datetime > patient_devices.issuance_datetime) AND (((patient_devices.unassignment_datetime IS NOT NULL) AND (reads.read_datetime < patient_devices.unassignment_datetime)) OR (patient_devices.unassignment_datetime IS NULL)))
              ->  Seq Scan on reads  (cost=0.00..7236.94 rows=255396 width=12) (actual time=0.035..64.433 rows=255450 loops=1)
                    Filter: ((read_datetime >= '2012-01-01 10:30:01'::timestamp without time zone) AND (read_datetime <= '2013-05-18 03:03:42'::timestamp without time zone))
              ->  Hash  (cost=900.78..900.78 rows=20560 width=37) (actual time=30.830..30.830 rows=25015 loops=1)
                    Buckets: 4096  Batches: 1  Memory Usage: 1755kB
                    ->  Hash Join  (cost=19.90..900.78 rows=20560 width=37) (actual time=0.776..20.551 rows=25015 loops=1)
                          Hash Cond: (patient_devices.device_id = devices.id)
                          ->  Seq Scan on patient_devices  (cost=0.00..581.93 rows=24893 width=24) (actual time=0.014..7.867 rows=25545 loops=1)
                                Filter: ((unassignment_datetime IS NOT NULL) OR (unassignment_datetime IS NULL))
                          ->  Hash  (cost=13.61..13.61 rows=503 width=13) (actual time=0.737..0.737 rows=503 loops=1)
                                Buckets: 1024  Batches: 1  Memory Usage: 24kB
                                ->  Seq Scan on devices  (cost=0.00..13.61 rows=503 width=13) (actual time=0.016..0.466 rows=503 loops=1)
                                      Filter: (entity_id = 2)
Total runtime: 2820.392 ms

My question is: how can I speed this up? Right now I am testing on my Windows machine, but ultimately it will be deployed on Ubuntu; will that make any difference? Any insight into why this takes 2 seconds would be greatly appreciated.
Thanks
It was suggested that the LIMIT might be changing the query plan. Here is the same query without the LIMIT. The slow part is still the Hash Join.
Also, here are the relevant tuning parameters. I am only testing on Windows right now, and I don't know what effect this has on a Linux machine.
shared_buffers = 2GB
effective_cache_size = 4GB
work_mem = 256MB
random_page_cost = 2.0
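If you want to experiment with different settings (such as the lower values suggested in the comments below) without editing postgresql.conf, they can be changed per session. A minimal sketch; the query under EXPLAIN ANALYZE is just an illustrative stand-in:

-- Session-local experiment: SET only affects the current connection,
-- so planner settings can be A/B-tested without a server restart.
SET work_mem = '5MB';           -- lower value proposed in the comments
SET random_page_cost = 1.2;     -- lower value proposed in the comments

EXPLAIN ANALYZE
SELECT count(*)
FROM reads
WHERE read_datetime BETWEEN '2012-01-01 10:30:01' AND '2013-05-18 03:03:42';

RESET work_mem;                 -- restore the configured defaults
RESET random_page_cost;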
Here are the statistics for the reads table.
Statistic   Value
Sequential Scans    130
Sequential Tuples Read  28865850
Index Scans 283630
Index Tuples Fetched    141421907
Tuples Inserted 255450
Tuples Updated  0
Tuples Deleted  0
Tuples HOT Updated  0
Live Tuples 255450
Dead Tuples 0
Heap Blocks Read    20441
Heap Blocks Hit 3493033
Index Blocks Read   8824
Index Blocks Hit    4840210
Toast Blocks Read   
Toast Blocks Hit    
Toast Index Blocks Read 
Toast Index Blocks Hit  
Last Vacuum 2013-05-20 09:23:03.782-07
Last Autovacuum 
Last Analyze    2013-05-20 09:23:03.91-07
Last Autoanalyze    2013-05-17 19:01:44.075-07
Vacuum counter  1
Autovacuum counter  0
Analyze counter 1
Autoanalyze counter 6
Table Size  27 MB
Toast Table Size    none
Indexes Size    34 MB

Here are the statistics for the devices table.
Statistic   Value
Sequential Scans    119
Sequential Tuples Read  63336
Index Scans 1053935
Index Tuples Fetched    1053693
Tuples Inserted 609
Tuples Updated  0
Tuples Deleted  0
Tuples HOT Updated  0
Live Tuples 609
Dead Tuples 0
Heap Blocks Read    32
Heap Blocks Hit 1054553
Index Blocks Read   32
Index Blocks Hit    2114305
Toast Blocks Read   
Toast Blocks Hit    
Toast Index Blocks Read 
Toast Index Blocks Hit  
Last Vacuum 
Last Autovacuum 
Last Analyze    
Last Autoanalyze    2013-05-17 19:02:49.692-07
Vacuum counter  0
Autovacuum counter  0
Analyze counter 0
Autoanalyze counter 2
Table Size  48 kB
Toast Table Size    none
Indexes Size    128 kB

Here are the statistics for the patient_devices table.
Statistic   Value
Sequential Scans    137
Sequential Tuples Read  3065400
Index Scans 853990
Index Tuples Fetched    46143763
Tuples Inserted 25545
Tuples Updated  24936
Tuples Deleted  0
Tuples HOT Updated  0
Live Tuples 25547
Dead Tuples 929
Heap Blocks Read    1959
Heap Blocks Hit 6099617
Index Blocks Read   1077
Index Blocks Hit    2462681
Toast Blocks Read   
Toast Blocks Hit    
Toast Index Blocks Read 
Toast Index Blocks Hit  
Last Vacuum 
Last Autovacuum 2013-05-17 19:01:44.576-07
Last Analyze    
Last Autoanalyze    2013-05-17 19:01:44.697-07
Vacuum counter  0
Autovacuum counter  6
Analyze counter 0
Autoanalyze counter 6
Table Size  2624 kB
Toast Table Size    none
Indexes Size    5312 kB

Below is the full query I am trying to speed up. The smaller query is indeed faster now, but I could not make my full query, shown below, any faster. As suggested, I added 4 new indexes: UNIQUE(device_id, issuance_datetime), UNIQUE(device_id, unassignment_datetime), UNIQUE(patient_id, issuance_datetime), UNIQUE(patient_id, unassignment_datetime).
SELECT 
first_name
, last_name
, MAX(max_read) AS read_datetime
, SUM(value) AS value
, serial_number
FROM (
    SELECT
    pa.first_name
    , pa.last_name
    , value
    , first_value(de.serial_number) OVER(PARTITION BY pa.id ORDER BY re.read_datetime DESC) AS serial_number -- I'm not sure if this is a good way to do this, but I don't know of another way
    , re.read_datetime
    , MAX(re.read_datetime) OVER (PARTITION BY pd.id) AS max_read
    FROM reads re
    INNER JOIN devices de ON de.id = re.device_id
    INNER JOIN patient_devices pd ON pd.device_id = de.id
        AND re.read_datetime >= pd.issuance_datetime
        AND re.read_datetime < COALESCE(pd.unassignment_datetime , 'infinity'::timestamp)
    INNER JOIN patients pa ON pa.id = pd.patient_id
    WHERE re.read_datetime BETWEEN '2012-01-01 10:30:01' AND '2013-05-18 03:03:42'
) AS foo WHERE read_datetime = max_read
GROUP BY first_name, last_name, serial_number ORDER BY value desc
LIMIT 10

Sorry for not posting this earlier, but I thought this query might be too complicated and was trying to simplify the problem, though apparently I still can't fix it. It seems it would be much faster if I could limit the results returned by the nested select using the max_read variable, but according to many sources that isn't allowed in Postgres.
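As an aside, a common workaround for the restriction on filtering by a window-function result is to avoid the window functions altogether and pick the latest read per assignment with DISTINCT ON. A minimal sketch against the schema above (untested against the real data; the outer SUM/GROUP BY from the full query would still need to be applied on top of this result):

-- Latest read per patient_devices assignment, without window functions.
-- DISTINCT ON keeps only the first row of each pd.id group under the
-- ORDER BY, i.e. the row with the most recent read_datetime.
SELECT DISTINCT ON (pd.id)
       pa.first_name
     , pa.last_name
     , de.serial_number
     , re.read_datetime
     , re.value
FROM reads re
INNER JOIN devices de ON de.id = re.device_id
INNER JOIN patient_devices pd ON pd.device_id = de.id
    AND re.read_datetime >= pd.issuance_datetime
    AND re.read_datetime < COALESCE(pd.unassignment_datetime, 'infinity'::timestamp)
INNER JOIN patients pa ON pa.id = pd.patient_id
WHERE re.read_datetime BETWEEN '2012-01-01 10:30:01' AND '2013-05-18 03:03:42'
ORDER BY pd.id, re.read_datetime DESC;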

2
Please add the actual DDL of your tables, including keys and indexes. - wildplasser
I just updated my question to show the real creation scripts for all the indexes I have. - Jon
1
Statistics, such as those maintained via VACUUM ANALYZE <tablename>;. Plus the relevant tuning parameters { shared_buffers, effective_cache_size, work_mem, random_page_cost }. - wildplasser
2
work_mem = 256MB random_page_cost = 2.0. Try setting work_mem to a sane (lower) value, and/or random_page_cost to a lower value (~1.2). If usable indexes (and valid statistics) are available, they will be used. Unless the query needs all the rows, that is (and I am afraid it does). - wildplasser
When I set work_mem to 5MB and random_page_cost to 1.2, the query now takes ~3.1 seconds and the hash join is still very slow. - Jon
2 Answers

6

As a reminder, the cleaned-up query:

SELECT pd.patient_id
        , de.serial_number
FROM reads re
INNER JOIN devices de ON de.id = re.device_id
INNER JOIN patient_devices pd ON pd.device_id = de.id
        AND re.read_datetime >= pd.issuance_datetime -- changed this from '>' to '>='
        AND (re.read_datetime < pd.unissuance_datetime OR pd.unissuance_datetime IS NULL)
WHERE re.read_datetime BETWEEN '2012-01-01 10:30:01.000000' AND '2013-05-18 03:03:42'
GROUP BY de.serial_number, pd.patient_id 
LIMIT 10
   ;

UPDATE: without the original typos:

EXPLAIN ANALYZE
SELECT pd.patient_id
        , de.serial_number
FROM reads re
INNER JOIN devices de ON de.id = re.device_id
INNER JOIN patient_devices pd ON pd.device_id = de.id
        AND re.read_datetime >= pd.issuance_datetime
        AND (re.read_datetime < pd.unassignment_datetime OR pd.unassignment_datetime IS NULL)
WHERE re.read_datetime BETWEEN '2012-01-01 10:30:01.000000' AND '2013-05-18 03:03:42'
GROUP BY de.serial_number, pd.patient_id
LIMIT 10
  ;

UPDATE: this one is about 6x faster here (on synthetic data, and with a slightly modified data model):

-- Modified data model + synthetic data:
CREATE TABLE devices
( id serial NOT NULL
, serial_number character varying(20) NOT NULL
-- , created_at timestamp without time zone NOT NULL
-- , updated_at timestamp without time zone NOT NULL
, CONSTRAINT devices_pkey PRIMARY KEY (id )
, UNIQUE (serial_number)
) ;

CREATE TABLE reads
  -- ( id serial NOT NULL PRIMARY KEY -- You don't need this surrogate key
  (  device_id integer NOT NULL REFERENCES devices (id)
  ,  value bigint NOT NULL
  ,  read_datetime timestamp without time zone NOT NULL
  -- ,  created_at timestamp without time zone NOT NULL
  -- ,  updated_at timestamp without time zone NOT NULL
        , PRIMARY KEY ( device_id, read_datetime)
  ) ;



CREATE TABLE patient_devices
-- ( id serial NOT NULL PRIMARY KEY -- You don't need this surrogate key
( patient_id integer NOT NULL -- REFERENCES patients (id)
, device_id integer NOT NULL REFERENCES devices(id)
, issuance_datetime timestamp without time zone NOT NULL
, unassignment_datetime timestamp without time zone
-- , created_at timestamp without time zone NOT NULL
-- , updated_at timestamp without time zone NOT NULL
, PRIMARY KEY (device_id, issuance_datetime)
, UNIQUE (device_id, unassignment_datetime)
) ;


-- CREATE INDEX index_patient_devices_on_issuance_datetime ON patient_devices (device_id, unassignment_datetime );
-- may need some additional indices later

-- devices -- ~500 rows
INSERT INTO devices(serial_number) SELECT 'No_' || gs::text FROM generate_series(1,500) gs;

-- reads -- ~100K rows
INSERT INTO reads(device_id, read_datetime, value)
SELECT  de.id, gs
        , (random()*1000000)::bigint
FROM devices  de
JOIN generate_series('2012-01-01', '2013-05-01' , '1 hour' ::interval) gs
        ON random() < 0.02;

-- patient_devices -- ~25,000 rows
INSERT INTO patient_devices(device_id, issuance_datetime, patient_id)
SELECT DISTINCT ON (re.device_id, read_datetime)
        re.device_id, read_datetime, pa
FROM generate_series(1,100) pa
JOIN reads re
        ON random() < 0.01;

        -- close the open intervals
UPDATE patient_devices dst
SET unassignment_datetime = src.issuance_datetime
FROM patient_devices src
WHERE src.device_id = dst.device_id
AND src.issuance_datetime > dst.issuance_datetime
AND NOT EXISTS ( SELECT *
        FROM patient_devices nx
        WHERE nx.device_id = src.device_id
        AND nx.issuance_datetime > dst.issuance_datetime
        AND nx.issuance_datetime < src.issuance_datetime
        )
        ;

VACUUM ANALYZE patient_devices;
VACUUM ANALYZE devices;
VACUUM ANALYZE reads;


-- EXPLAIN ANALYZE
SELECT pd.patient_id
        , de.serial_number
        --, COUNT (*) AS zcount
FROM reads re
INNER JOIN devices de ON de.id = re.device_id
INNER JOIN patient_devices pd ON pd.device_id = de.id
        AND re.read_datetime >= pd.issuance_datetime
        AND re.read_datetime < COALESCE(pd.unassignment_datetime , 'infinity'::timestamp)
WHERE re.read_datetime BETWEEN '2012-01-01 10:30:01' AND '2013-05-18 03:03:42'
GROUP BY de.serial_number, pd.patient_id
LIMIT 10
        ;

1
How does this help me? - Jon
1
It might not help you as such, but it could help those who are trying to help you (and chances are they would start with exactly the same rewrite). - wildplasser
1
OK, hope it works ;) - Jon
1
I've updated my question to include the full query. Your suggestions did speed up the original query I posted. I tried to incorporate your suggestions into the full query, but I'm not seeing the speedup there. - Jon
4
Most of my answer is not about the query as such, but about the data model: I removed the (IMHO) useless surrogate keys and added (IMHO) sensible foreign key constraints. The underlying problem remains a three-way triangular model: {device, patient, datetime} is the two/three-dimensional primary key on your junction table patient_devices. That could indicate a BCNF violation, but it depends on the actual content of the data. - wildplasser
I added an extra index I discovered I had forgotten, and the full query is now down to ~1.2 seconds. Honestly, I'm not a DBA and am learning a lot about databases right now. I guess I need to read up on BCNF (since I had never heard of it). I thought using an intermediate table for many-to-many relationships was common practice. If that isn't the industry standard, what would you suggest? - Jon

0

Look at the parts of the ANALYZE report where a Seq Scan appears.

For example, these parts may need some indexes:

 Seq Scan on patient_devices -> unassignment_datetime
 Seq Scan on devices -> entity_id
 Seq Scan on reads -> read_datetime

Regarding read_datetime: specific indexes can be created for comparison operators (such as >, <, =), which would come in handy here. I don't know the syntax for it, though.
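For reference, an ordinary btree index already supports those comparison operators (<, <=, =, >=, > and BETWEEN). What can help more is a composite index that covers both the join key and the range column, so the scan on reads is narrowed in a single index pass. A sketch; the index name is made up:

-- Composite btree: lets the planner satisfy the device_id join condition
-- and the read_datetime range filter from one index scan.
CREATE INDEX index_reads_on_device_id_and_read_datetime
    ON reads USING btree (device_id, read_datetime);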

These caught my attention too, and I already have btree indexes on all of them, so I'm not sure what else to do. It seems like all the time is spent in the hash join, though. - Jon
Also, I'm not sure which index function you're referring to. After reading the docs I think btree is the index type I want, but if that's not right I'd be glad to learn what I actually need. - Jon
