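I have a database table (Oracle 11g) of questionnaire feedback that includes multiple-choice, multiple-answer questions. The OPTIONS column holds every value the user could select, and the ANSWERS column holds the numeric values of what they actually selected (the position of each chosen option in the list).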
    ID_NO  OPTIONS                            ANSWERS
    1001   Apple Pie|Banana-Split|Cream Tea   1|2
    1002   Apple Pie|Banana-Split|Cream Tea   2|3
    1003   Apple Pie|Banana-Split|Cream Tea   1|2|3
I need a query that decodes the answers and returns their text versions as a single string.
    ID_NO  ANSWERS  ANSWER_DECODE
    1001   1|2      Apple Pie|Banana-Split
    1002   2|3      Banana-Split|Cream Tea
    1003   1|2|3    Apple Pie|Banana-Split|Cream Tea
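I have experimented with regular expressions to replace values and to extract substrings, but I cannot work out how to combine the two correctly.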
    WITH feedback AS (
      SELECT 1001 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '1|2' answers FROM DUAL UNION
      SELECT 1002 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '2|3' answers FROM DUAL UNION
      SELECT 1003 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '1|2|3' answers FROM DUAL )
    SELECT
      id_no,
      options,
      REGEXP_SUBSTR(options||'|', '(.)+?\|', 1, 2) second_option,
      answers,
      REGEXP_REPLACE(answers, '(\d)+', ' \1 ') answer_numbers,
      REGEXP_REPLACE(answers, '(\d)+', REGEXP_SUBSTR(options||'|', '(.)+?\|', 1, To_Number('2'))) "???"
    FROM feedback
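I don't want to define or decode the answers by hand in the SQL. There are many surveys, with different questions and different numbers of options, so I am looking for a solution that works dynamically for all of them.
I have also tried splitting the options and the answers into separate rows by LEVEL and re-joining them where the codes match, but this runs extremely slowly on the real data set (a 5-option question with 600 rows of responses).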
    WITH feedback AS (
      SELECT 1001 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '1|2' answers FROM DUAL UNION
      SELECT 1002 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '2|3' answers FROM DUAL UNION
      SELECT 1003 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '1|2|3' answers FROM DUAL )
    SELECT
      answer_rows.id_no,
      ListAgg(option_rows.answer) WITHIN GROUP(ORDER BY option_rows.lvl)
    FROM
      (SELECT DISTINCT
         LEVEL lvl,
         REGEXP_SUBSTR(options||'|', '(.)+?\|', 1, LEVEL) answer
       FROM
         (SELECT DISTINCT
            options,
            REGEXP_COUNT(options||'|', '(.)+?\|') num_choices
          FROM
            feedback)
       CONNECT BY LEVEL <= num_choices
      ) option_rows
      LEFT OUTER JOIN
      (SELECT DISTINCT
         id_no,
         to_number(REGEXP_SUBSTR(answers, '(\d)+', 1, LEVEL)) answer
       FROM
         (SELECT DISTINCT
            id_no,
            answers,
            To_Number(REGEXP_SUBSTR(answers, '(\d)+$')) max_answer
          FROM
            feedback)
       WHERE
         to_number(REGEXP_SUBSTR(answers, '(\d)+', 1, LEVEL)) IS NOT NULL
       CONNECT BY LEVEL <= max_answer
      ) answer_rows
      ON option_rows.lvl = answer_rows.answer
    GROUP BY
      answer_rows.id_no
    ORDER BY
      answer_rows.id_no
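If this cannot be solved with regular expressions alone, is there a more efficient way than LEVEL to split the values? Or is there some other workable approach?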
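
For reference, one direction that might be worth evaluating is sketched below. It is untested against the real data and rests on several assumptions: each number in ANSWERS is the 1-based position of an option, no option is an empty string, id_no uniquely identifies a row (otherwise the question key would need to be added to the CONNECT BY and GROUP BY), and LISTAGG is available (11g Release 2, as in the query above). Instead of splitting options and answers separately and joining them, it generates one row per answer within each feedback row, looks the option up directly by position, and aggregates the results back into one string. The `PRIOR SYS_GUID() IS NOT NULL` condition is a commonly used trick to keep the hierarchy confined to a single source row without triggering a cycle error.

```sql
WITH feedback AS (
  SELECT 1001 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '1|2' answers FROM DUAL UNION ALL
  SELECT 1002 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '2|3' answers FROM DUAL UNION ALL
  SELECT 1003 id_no, 'Apple Pie|Banana-Split|Cream Tea' options, '1|2|3' answers FROM DUAL )
SELECT
  id_no,
  answers,
  -- For the LEVEL-th answer number, pick the option at that position,
  -- then glue the picked options back together with '|'.
  LISTAGG(
    REGEXP_SUBSTR(options, '[^|]+', 1,
                  TO_NUMBER(REGEXP_SUBSTR(answers, '\d+', 1, LEVEL))),
    '|'
  ) WITHIN GROUP (ORDER BY LEVEL) answer_decode
FROM feedback
-- One generated row per answer number, restricted to the same source row.
CONNECT BY LEVEL <= REGEXP_COUNT(answers, '\d+')
       AND PRIOR id_no = id_no
       AND PRIOR SYS_GUID() IS NOT NULL
GROUP BY id_no, answers
ORDER BY id_no
```

The intent is to reproduce the ANSWER_DECODE column from the desired output above. Because the row generation is correlated to each feedback row, it avoids the DISTINCT and the join used in the slow version, though whether it is actually faster on the real table would need to be measured.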