使用MySQL在更改查找列的情况下查找最近或精确匹配值

3
我有以下数据集... 条件监测位置数据(CML表)
+--------+--------------------+--------------------+--------------+------------+--------------------+-------+-----+
| CML_ID |     POF_COLUMN     |      CML_TYPE      | SAMPLE_VALUE | COMPLIANCE | CORROSION_SEVERITY | LR_LD | POF |
+--------+--------------------+--------------------+--------------+------------+--------------------+-------+-----+
|      1 | SAMPLE_VALUE       | MIC_SAMPLING_POINT |            5 | NO         | MINOR              |     1 |     |
|      2 | SAMPLE_VALUE       | MIC_SAMPLING_POINT |          0.5 | NO         | MINOR              |     2 |     |
|      3 | SAMPLE_VALUE       | MIC_SAMPLING_POINT |           20 | NO         | MINOR              |     3 |     |
|      4 | COMPLIANCE         | VALVE_ROTATED      |            0 | YES        | MINOR              |     4 |     |
|      5 | LR_LD              | PIPING_THICKNESS   |            0 | YES        | MINOR              |   0.1 |     |
|      6 | CORROSION_SEVERITY | VESSEL_SHELL       |            0 | NO         | SEVERE             |     0 |     |
|      7 | CORROSION_SEVERITY | NOZZLE             |            0 | NO         | LOW                |     0 |     |
+--------+--------------------+--------------------+--------------+------------+--------------------+-------+-----+

故障概率查询数据(POF表)
+--------------------+--------------------+-------------+-----+
|     POF_COLUMN     |      CML_TYPE      | VALUE_RANGE | POF |
+--------------------+--------------------+-------------+-----+
| SAMPLE_VALUE       | MIC_SAMPLING_POINT | 1           |   5 |
| SAMPLE_VALUE       | MIC_SAMPLING_POINT | 5           |   4 |
| SAMPLE_VALUE       | MIC_SAMPLING_POINT | 10          |   3 |
| SAMPLE_VALUE       | MIC_SAMPLING_POINT | 15          |   2 |
| SAMPLE_VALUE       | MIC_SAMPLING_POINT | 100         |   1 |
| COMPLIANCE         | VALVE_ROTATED      | YES         |   5 |
| COMPLIANCE         | VALVE_ROTATED      | NO          |   1 |
| LR_LD              | PIPING_THICKNESS   | 2           |   5 |
| LR_LD              | PIPING_THICKNESS   | 1.5         |   4 |
| LR_LD              | PIPING_THICKNESS   | 1           |   3 |
| LR_LD              | PIPING_THICKNESS   | 0.8         |   2 |
| LR_LD              | PIPING_THICKNESS   | 0.5         |   1 |
| CORROSION_SEVERITY | VESSEL_SHELL       | NEGLIGIBLE  |   5 |
| CORROSION_SEVERITY | VESSEL_SHELL       | LOW         |   4 |
| CORROSION_SEVERITY | VESSEL_SHELL       | MEDIUM      |   3 |
| CORROSION_SEVERITY | VESSEL_SHELL       | HIGH        |   2 |
| CORROSION_SEVERITY | VESSEL_SHELL       | SEVERE      |   1 |
| CORROSION_SEVERITY | NOZZLE             | NEGLIGIBLE  |   5 |
| CORROSION_SEVERITY | NOZZLE             | LOW         |   5 |
| CORROSION_SEVERITY | NOZZLE             | MEDIUM      |   5 |
| CORROSION_SEVERITY | NOZZLE             | HIGH        |   3 |
| CORROSION_SEVERITY | NOZZLE             | SEVERE      |   2 |
+--------------------+--------------------+-------------+-----+

我需要使用CML表中的每个CML记录的[POF_Column]、[CML_Type]和[样本值]、[合规性]、[腐蚀严重程度]或[LR_LD]字段之一,返回POF表中最接近或匹配的记录,然后将该记录的POF更新到CML表中。
例如,如果我们看CML_ID = 2:
根据[POF_COLUMN] ='SAMPLE_VALUE',[CML_TYPE] = 'MIC_SAMPLING_POINT'并且[VALUE_RANGE]- [SAMPLE_VALUE]列中的值(在此示例中为0.5)是最小值来过滤POF表。
在这种情况下,它将匹配POF表中的第一条记录,并返回值POF = 5。
如果我们再看另一个案例,CML_ID = 7:
根据[POF_COLUMN] = 'CORROSION_SEVERITY',[CML_TYPE] = 'NOZZLE'并且[VALUE_RANGE] = [CORROSION_SEVERITY]列中的值,即'LOW'来过滤POF表。
在这种情况下,它将匹配POF表中的倒数第四行,并返回值POF = 5。
总之,我需要更新CML表格以显示以下结果...
+--------+---+-----+
| CML_ID | … | POF |
+--------+---+-----+
|      1 | … |   4 |
|      2 | … |   5 |
|      3 | … |   2 |
|      4 | … |   5 |
|      5 | … |   1 |
|      6 | … |   1 |
|      7 | … |   5 |
+--------+---+-----+

有人知道我如何实现这个吗?我在下面放了一些我的尝试样本代码。它可以查找精确匹配的值,但无法查找最接近的匹配值。
DROP DATABASE IF EXISTS VESSELS;
CREATE DATABASE VESSELS;
USE VESSELS;

CREATE TABLE CML (
  `CML_ID`              INTEGER NOT NULL,
  `POF_COLUMN`          VARCHAR(50),
  `CML_TYPE`            VARCHAR(50),
  `SAMPLE_VALUE`        FLOAT,
  `COMPLIANCE`          ENUM('YES','NO'),
  `CORROSION_SEVERITY`  VARCHAR(50),
  `LR_LD`               FLOAT,
  `POF`                 TINYINT,
  PRIMARY KEY (CML_ID)
);

CREATE TABLE POF (
  `POF_COLUMN`   VARCHAR(50),
  `CML_TYPE`     VARCHAR(50),
  `VALUE_RANGE`  VARCHAR(100),
  `POF`          TINYINT,
  PRIMARY KEY (POF_COLUMN, CML_TYPE, VALUE_RANGE)
);

INSERT INTO CML 
VALUES (1, 'SAMPLE_VALUE', 'MIC_SAMPLING_POINT', 5, 'NO', 'MINOR', 1, NULL),
       (2, 'SAMPLE_VALUE', 'MIC_SAMPLING_POINT', 0.5, 'NO', 'MINOR', 2, NULL),
       (3, 'SAMPLE_VALUE', 'MIC_SAMPLING_POINT', 20, 'NO', 'MINOR', 3, NULL), 
       (4, 'COMPLIANCE', 'VALVE_ROTATED', 0, 'YES', 'MINOR', 4, NULL), 
       (5, 'LR_LD', 'PIPING_THICKNESS', 0, 'YES', 'MINOR', 0.1, NULL),
       (6, 'CORROSION_SEVERITY', 'VESSEL_SHELL', 0, 'NO', 'SEVERE', 0, NULL),
       (7, 'CORROSION_SEVERITY', 'NOZZLE', 0, 'NO', 'LOW', 0, NULL);

INSERT INTO POF 
VALUES ('SAMPLE_VALUE', 'MIC_SAMPLING_POINT', 1, 5),
       ('SAMPLE_VALUE', 'MIC_SAMPLING_POINT', 5, 4),
       ('SAMPLE_VALUE', 'MIC_SAMPLING_POINT', 10, 3),
       ('SAMPLE_VALUE', 'MIC_SAMPLING_POINT', 15, 2),
       ('SAMPLE_VALUE', 'MIC_SAMPLING_POINT', 100, 1),
       ('COMPLIANCE', 'VALVE_ROTATED', 'YES', 5), 
       ('COMPLIANCE', 'VALVE_ROTATED', 'NO', 1), 
       ('LR_LD', 'PIPING_THICKNESS', 2, 5),
       ('LR_LD', 'PIPING_THICKNESS', 1.5, 4),
       ('LR_LD', 'PIPING_THICKNESS', 1, 3),
       ('LR_LD', 'PIPING_THICKNESS', 0.8, 2),
       ('LR_LD', 'PIPING_THICKNESS', 0.5, 1),
       ('CORROSION_SEVERITY', 'VESSEL_SHELL', 'NEGLIGIBLE', 5),
       ('CORROSION_SEVERITY', 'VESSEL_SHELL', 'LOW', 4),
       ('CORROSION_SEVERITY', 'VESSEL_SHELL', 'MEDIUM', 3),
       ('CORROSION_SEVERITY', 'VESSEL_SHELL', 'HIGH', 2),
       ('CORROSION_SEVERITY', 'VESSEL_SHELL', 'SEVERE', 1),
       ('CORROSION_SEVERITY', 'NOZZLE', 'NEGLIGIBLE', 5),
       ('CORROSION_SEVERITY', 'NOZZLE', 'LOW', 5),
       ('CORROSION_SEVERITY', 'NOZZLE', 'MEDIUM', 5),
       ('CORROSION_SEVERITY', 'NOZZLE', 'HIGH', 3),
       ('CORROSION_SEVERITY', 'NOZZLE', 'SEVERE', 2);

-- UPDATE EXACT MATCHING VALUE
UPDATE CML c
LEFT JOIN POF p
ON c.POF_COLUMN = p.POF_COLUMN
   AND c.CML_TYPE = p.CML_TYPE
   AND ( (c.POF_COLUMN = 'COMPLIANCE' AND c.COMPLIANCE = p.VALUE_RANGE) OR
         (c.POF_COLUMN = 'SAMPLE_VALUE' AND c.SAMPLE_VALUE = p.VALUE_RANGE) OR
         (c.POF_COLUMN = 'LR_LD' AND c.LR_LD = p.VALUE_RANGE) OR
         (c.POF_COLUMN = 'CORROSION_SEVERITY' AND c.CORROSION_SEVERITY = p.VALUE_RANGE)
       )
SET c.POF = p.POF;

SELECT * FROM CML;
2个回答

1
很好的问题陈述,详细说明了问题。
你面临一个挑战。
A)POF中的值可以是字符串或浮点数。当它是字符串时,需要进行精确比较。如果它是浮点数,则要找到最接近的值。
这似乎是应用服务器中的业务代码,但假设您想在MySQL中执行此操作。
答案是一个CASE语句,通过CML_TYPE确定如何计算POF。对于“字符串”类型比较,这将是一个等号。对于“浮点数”类型比较,您可以编写比较来获取最接近提供的值的记录。无论哪种方式,每个规则都略有不同。
您需要为每个CML_TYPE创建一个CASE语句,然后创建一个自定义匹配器来查找所需的POF。
以下代码实际上可以工作,但性能不能保证。
UPDATE CML c
  JOIN 
(select CML_ID, 
  CASE CML_TYPE 
     WHEN 'VALVE_ROTATED' THEN
       (select POF from POF where POF.CML_TYPE = CML.CML_TYPE and VALUE_RANGE = CML.COMPLIANCE)
     WHEN 'VESSEL_SHELL' THEN
       (select POF from POF where POF.CML_TYPE = CML.CML_TYPE and VALUE_RANGE = CML.CORROSION_SEVERITY)
     WHEN 'NOZZLE' THEN
       (select POF from POF where POF.CML_TYPE = CML.CML_TYPE and VALUE_RANGE = CML.CORROSION_SEVERITY)
     WHEN 'MIC_SAMPLING_POINT' THEN
       (select POF from POF where POF.CML_TYPE = CML.CML_TYPE ORDER BY ABS(CML.SAMPLE_VALUE - cast(VALUE_RANGE AS DECIMAL(10,2))) LIMIT 1)
     WHEN 'PIPING_THICKNESS' THEN
       (select POF from POF where POF.CML_TYPE = CML.CML_TYPE ORDER BY ABS(CML.SAMPLE_VALUE - cast(VALUE_RANGE AS DECIMAL(10,2))) LIMIT 1)
     ELSE 'BLAH'       
  END as CALC_POF
from CML) as updater on c.CML_ID = updater.CML_ID
set c.POF = updater.CALC_POF;

Link TO SQL Fiddle

这句话的意思是保留HTML格式,其中包含一个链接到SQL Fiddle页面的链接。

@Brain 感谢你的回答。正如你所说,这确实可行,但我不想在 UPDATE 语句中使用相关子查询。我设法想出了一个解决方案(请参见下面对 annoyXmous 回答的评论),它可以满足我的需求。感谢你的回答,它给了我一些有用的提示,帮助我得出最终的解决方案! - Josh

1

这篇文章非常详细,包含样本数据和预期结果,点赞支持。问题在于 sample_value 和 lr_ld 可能不是来自 POF 表的确切值。然而,您会注意到这些值要么相等,要么小于该值范围。

因此,如果我们得到了 POF 值的最大值,其中 sample_value 或 lr_ld 小于或等于值范围,那么我们只需要获取 POF 值的最大值即可。

仅当当 sample_value 或 lr_ld 增加时,POF 值才会增加,因此此查询才有效。

UPDATE CML c
  JOIN 
(
    select c.CML_ID, max(p.POF) POF 
    from CML c
    LEFT JOIN POF p
    ON c.POF_COLUMN = p.POF_COLUMN
       AND c.CML_TYPE = p.CML_TYPE
       AND ( (c.POF_COLUMN = 'COMPLIANCE' AND c.COMPLIANCE = p.VALUE_RANGE) OR
             (c.POF_COLUMN = 'SAMPLE_VALUE' AND  c.SAMPLE_VALUE<=p.VALUE_RANGE) OR
             (c.POF_COLUMN = 'LR_LD' AND c.LR_LD  <= p.VALUE_RANGE) OR
             (c.POF_COLUMN = 'CORROSION_SEVERITY' AND c.CORROSION_SEVERITY = p.VALUE_RANGE)
           )
     group by c.CML_ID 
   ) t
 on c.CML_ID = t.CML_ID
set c.POF = t.POF;

谢谢,这给了我提示,让我得以得出最终结果,该结果与我的记录匹配,无论是在POF表中的主键(POF_COLUMN、CML_TYPE和VALUE_RANGE)上进行精确匹配还是最接近匹配。请查看我在提供的fiddle链接中的最终解决方案。http://sqlfiddle.com/#!9/0eeb9b/1 - Josh
你能检查一下 cml id = 3 吗?它应该是 1 对吧? - jose_bacoy
不,应该明确为2。SAMPLE_VALUE = 20,在POF表中最接近的匹配值为15。因此,POF为2。 - Josh

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接