SQL:如何通过CASE优化INNER JOIN的性能

3

我有一个关于查询的性能问题,需要读取一个过滤表来获取筛选器,并读取与这些筛选器相关的源数据表。 因此,我的筛选器表看起来像这样;

FILTER_ID CUSTOMER COUNTRY DEPARTMENT
   1                  UK      DEP1
   2       CUS1       US      DEP1
   3       CUS1  

这是我的源数据表格:

ROW_NO CUSTOMER COUNTRY DEPARTMENT
  1      CUS1     UK      DEP1
  2      CUS2     UK      DEP1
  3      CUS3     UK      DEP1
  4      CUS1     US      DEP1
  5      CUS1     SG      DEP3
  6      CUS1     UK      DEP3

对于筛选器表中的每个筛选器,我需要从源数据表中获取行。但是,如果FILTER表中的列为空,我们需要读取源数据表中存在的该列的所有成员。假设对于FILTER_ID 1,我们需要从源表中读取所有COUNTRY = UK和DEPARTMENT = DEP1的客户。
结果表应如下所示;
FILTER_ID   ROW_NO   CUSTOMER   COUNTRY   DEPARTMENT
   1           1       CUS1       UK        DEP1
   1           2       CUS2       UK        DEP1
   1           3       CUS3       UK        DEP1
   2           4       CUS1       US        DEP1
   3           1       CUS1       UK        DEP1
   3           4       CUS1       US        DEP1
   3           5       CUS1       SG        DEP3
   3           6       CUS1       UK        DEP3

我正在使用条件连接,它正常工作,但问题是,速度非常慢!

select t1.FILTER_ID, t2.* from FILTER_TABLE as t1
inner join SOURCE_DATA as t2 on 
CASE WHEN t1.CUSTOMER    = '' THEN t2.CUSTOMER   ELSE t1.CUSTOMER END = t2.CUSTOMER and
CASE WHEN t1.DEPARTMENT  = '' THEN t2.DEPARTMENT ELSE t1.DEPARTMENT END = t2.DEPARTMENT and
CASE WHEN t1.COUNTRY     = '' THEN t2.COUNTRY    ELSE t1.COUNTRY END = t2.COUNTRY 

有没有一种方法可以优化这段代码?

2
您可以将CASE语句重写为OR的组合(因为t2.CUSTOMER = t2.CUSTOMER几乎每次都可以通过(除非t2.CUSTOMERNULL)),例如:((t1.CUSTOMER = '') OR (t1.CUSTORMER <> '' AND t1.CUSTOMER = t2.CUSTOMER))。请检查并告诉我们结果。 - Rokuto
3
你能否将FILTER_TABLE拆分成9个表,这些表只包含“非空”列?然后编写9个SELECT语句,在它们之间使用union all连接,每个SELECT语句仅在固定列上进行连接,不使用任何CASE语句? - sepupic
@Rokuto,用这种方式运行速度快了2倍。原来需要30分钟以上,现在只需要14分钟就能完成。谢谢! - bilen
应用函数到字段会阻止优化器使用索引。这就是为什么从CASE更改为OR会导致改进的原因。但执行计划是什么?这些字段是否被索引覆盖?服务器是否使用查找或表扫描? - Panagiotis Kanavos
1
@AlbertoMartinez,尝试创建一个带有“动态”参数的大型“智能”查询是一个非常常见的错误。结果总是令人失望的,因为这些“智能”查询会导致低效的执行计划和/或无法使用索引。 - Panagiotis Kanavos
显示剩余6条评论
2个回答

0

试试这样做,我认为会更快:

select t1.FILTER_ID, t2.* from FILTER_TABLE as t1
inner join SOURCE_DATA as t2 on 
 t1.CUSTOMER    = t2.CUSTOMER and
 t1.DEPARTMENT  = t2.DEPARTMENT and
 t1.COUNTRY     = t2.COUNTRY 
where t1.CUSTOMER <> '' and t1.DEPARTMENT <>'' and t1.COUNTRY <> ''
union all
select t1.FILTER_ID, t2.* from FILTER_TABLE as t1
inner join SOURCE_DATA as t2 on 
CASE WHEN t2.CUSTOMER    = t2.CUSTOMER 
where t1.CUSTOMER = '' and t1.DEPARTMENT <>'' and t1.COUNTRY <> ''
union all
select t1.FILTER_ID, t2.* from FILTER_TABLE as t1
inner join SOURCE_DATA as t2 on 
CASE WHEN t2.DEPARTMENT    = t2.DEPARTMENT 
where t1.CUSTOMER <> '' and t1.DEPARTMENT = '' and t1.COUNTRY <> ''
union all
select t1.FILTER_ID, t2.* from FILTER_TABLE as t1
inner join SOURCE_DATA as t2 on 
CASE WHEN t2.COUNTRY    = t2.COUNTRY 
where t1.CUSTOMER <> '' and t1.DEPARTMENT <> '' and t1.COUNTRY = ''

谢谢您的回答,但您的解决方案与上面的类似。同时,不同的列可以为空。比如说,CUSTOMER=' '和DEPARTMENT=' '。 我在生产服务器上有12个列,如果我尝试使用所有组合,那么组合将会非常复杂。 - bilen

0

尝试像这样:

select t1.FILTER_ID, t2.* from FILTER_TABLE as t1
inner join SOURCE_DATA as t2 on 
 IIF(t1.CUSTOMER = '',t2.CUSTOMER, t1.CUSTOMER)  = t2.CUSTOMER and
 IIF(t1.DEPARTMENT = '', t2.DEPARTMENT,t1.DEPARTMENT) = t2.DEPARTMENT  and
 IIF(t1.COUNTRY = '',t1.COUNTRY,t2.COUNTRY) = t2.COUNTRY 

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接