获取前10个未使用的手动序列号

4
我希望找到一个范围内前10个未使用的手动序列号。以下是我的查询:
select X1.* From
    (Select Rownum seq_number From Dual Connect By Rownum <= 
         (Select LPAD(9,(UTC.DATA_PRECISION - UTC.DATA_SCALE),9) 
          From User_Tab_Columns UTC 
          where UTC.Table_Name = 'Table_Name' And UTC.Column_Name = 'seq_number')) X1,
Table_Name X2
Where X1.seq_number = X2.seq_number (+)
  And X2.Rowid is Null
  And Rownum <= 10

尽管它提供了所需的输出,但我担心由此查询引起的负载[如果有的话],因为我们将每天多次使用此查询。
请指导是否有优化此查询的方法。
附加信息: 在表Table_Name T2上,定义了一个(seq_number)的唯一索引。
工作示例:
create table TEMP_TABLE_NAME ( seq_number number(6))

insert into TEMP_TABLE_NAME 
select distinct trunc(dbms_random.VALUE(1,5000)) seq_number 
from dual
connect by rownum <= 1000


create unique index TEMP_TABLE_NAME_IDX on TEMP_TABLE_NAME(seq_Number)


SELECT T1.*
  FROM (    SELECT ROWNUM seq_number
              FROM DUAL
        CONNECT BY ROWNUM <=
                      (SELECT LPAD (9,(UTC.DATA_PRECISION - UTC.DATA_SCALE),9)
                         FROM User_Tab_Columns UTC
                        WHERE     UTC.Table_Name = 'TEMP_TABLE_NAME'
                              AND UTC.Column_Name = 'SEQ_NUMBER')) T1,
       TEMP_TABLE_NAME T2
 WHERE     T1.seq_number = T2.seq_number(+)
       AND T2.ROWID IS NULL
       AND ROWNUM <= 10

对我而言,我的查询结果如下。表中随机生成的数字包括7和8,因此它们被忽略了。重点是获取前10个未使用的数字。

1
2
3
4
5
6
9
10
11
12

1
你能展示一下这个数据的样本和期望结果吗?我正在尝试在一个只有10行的测试表上运行你的查询,但是即使对于10行,我仍然得到了“ORA-30009: Not enough memory for CONNECT BY operation”错误。 - krokodilko
@krokodilko 你好,请检查一下,我已经编辑了我的帖子,并提供了一个可行的示例来复制完全相同的情况。 - pOrinG
你使用的是哪个版本的Oracle? - trincot
@trincot Oracle数据库11g企业版发布11.2.0.4.0 - 64位生产版本。 - pOrinG
你的表有多大?number(6)表示最多只能有999999行,即不超过1百万。 - krokodilko
@krokodilko 实际上,这个真实的表并不是很大,只是在设计表时我们使用了number(6)来表示唯一序列字段。在任何给定时间,真实的表最多只会有5000条记录。 - pOrinG
1个回答

2

首先,我会替换这个复杂的子查询:

Select Rownum seq_number From Dual Connect By Rownum <= 
         (Select LPAD(9,(UTC.DATA_PRECISION - UTC.DATA_SCALE),9) 
          From User_Tab_Columns UTC 
          where UTC.Table_Name = 'Table_Name' And UTC.Column_Name = 'seq_number')

使用这个:
Select Rownum As seq_number From Dual 
Connect By Rownum <= (Select max( seq_number ) + 10 From TEMP_TABLE_NAME ) 

甚至可以使用一个简单的常量:

Select Rownum As seq_number From Dual Connect By Rownum <= 1000000

您的子查询在一个非常基本的情况下实际上是无效的:
create table TEMP_TABLE_NAME(
  seq_number NUMBER
);

SELECT LPAD (9,(UTC.DATA_PRECISION - UTC.DATA_SCALE),9) as x , 
       UTC.DATA_PRECISION, UTC.DATA_SCALE, UTC.COLUMN_NAME
FROM User_Tab_Columns UTC
WHERE     UTC.Table_Name = 'TEMP_TABLE_NAME'
  AND UTC.Column_Name = 'SEQ_NUMBER'
;

X        DATA_PRECISION DATA_SCALE COLUMN_NAME
-------- -------------- ---------- -----------
  (null)         (null)     (null) SEQ_NUMBER

第二个案例:


create table TEMP_TABLE_NAME(
  seq_number NUMBER(15,0)
);

在这种情况下,子查询试图生成999999999999999行,很快就会导致内存错误。
SELECT count(*) FROM (
 SELECT ROWNUM seq_number
              FROM DUAL
        CONNECT BY ROWNUM <=
                      (SELECT LPAD (9,(UTC.DATA_PRECISION - UTC.DATA_SCALE),9)
                         FROM User_Tab_Columns UTC
                        WHERE     UTC.Table_Name = 'TEMP_TABLE_NAME'
                              AND UTC.Column_Name = 'SEQ_NUMBER')
);

ORA-30009: Not enough memory for CONNECT BY operation
30009. 0000 -  "Not enough memory for %s operation"
*Cause:    The memory size was not sufficient to process all the levels of the
           hierarchy specified by the query.
*Action:   In WORKAREA_SIZE_POLICY=AUTO mode, set PGA_AGGREGATE_TARGET to
           a reasonably larger value.
           Or, in WORKAREA_SIZE_POLICY=MANUAL mode, set SORT_AREA_SIZE to a
           reasonably larger value.

其次,您的查询不是确定性的!!!它强烈依赖于物理表结构,并且没有使用ORDER BY子句来强制正确的顺序。请记住-> ORDER BY是唯一可以对结果集中的行进行排序的方法。如果没有此子句,则关系数据库系统可能以任何顺序返回行。如果需要排序,则应用程序必须在SELECT语句中提供ORDER BY。考虑以下测试案例:
create table TEMP_TABLE_NAME 
as SELECT * FROM (
    select rownum as seq_number , t.*
    from ALL_OBJECTS t
    cross join ( select * from dual connect by level <= 10)
    where rownum <= 100000
)
ORDER BY DBMS_RANDOM.Value;
create unique index TEMP_TABLE_NAME_IDX on TEMP_TABLE_NAME(seq_Number);

select count(*) from TEMP_TABLE_NAME;
  COUNT(*)
----------
    100000

DELETE FROM TEMP_TABLE_NAME
WHERE seq_number between 10000 and 10002
  OR seq_number between 20000 and 20002
  OR seq_number between 30000 and 30002
  OR seq_number between 40000 and 40002
  OR seq_number between 50000 and 50002
  OR seq_number between 60000 and 60002
  ;

如果索引存在,则结果正常。
SELECT T1.*
  FROM (    SELECT ROWNUM seq_number
              FROM DUAL
        CONNECT BY ROWNUM <= 1000000
) T1,
       TEMP_TABLE_NAME T2
 WHERE     T1.seq_number = T2.seq_number(+)
       AND T2.ROWID IS NULL
       AND ROWNUM <= 10
;

SEQ_NUMBER
----------
     10000
     10001
     10002
     20000
     20001
     20002
     30000
     30001
     30002
     40000

但是,如果有一天有人删除了索引,或者优化器因某种原因决定不使用该索引会发生什么?
根据定义:没有ORDER BY语句,关系数据库系统可能以任何顺序返回行。 我使用提示来模拟这些情况:

SELECT /*+ NO_INDEX(T2) */ T1.*
  FROM (    SELECT ROWNUM seq_number
              FROM DUAL
        CONNECT BY ROWNUM <= 1000000
) T1,
       TEMP_TABLE_NAME T2
 WHERE     T1.seq_number = T2.seq_number(+)
       AND T2.ROWID IS NULL
       AND ROWNUM <= 10
;

SEQ_NUMBER
----------
    213856
    910281
    668862
    412743
    295487
    214762
    788486
    346216
    777734
    806457

下面的查询使用ORDER BY子句强制正确的顺序,并且无论适当的索引是否存在,都会给出可重复的结果。
我正在使用推荐的ANSI SQL LEFT JOIN子句,而不是过时的WHERE .... (+)语法。
SELECT  * FROM (
    SELECT /*+ NO_INDEX(T2) */ T1.*
      FROM (    SELECT ROWNUM seq_number
                  FROM DUAL
            CONNECT BY ROWNUM <= 1000000
    ) T1 
    LEFT JOIN TEMP_TABLE_NAME T2
    ON T1.seq_number = T2.seq_number
    WHERE T2.ROWID IS NULL
    ORDER BY T1.seq_number
)
WHERE ROWNUM <= 10

性能
检查性能最简单的方法是进行测试——运行查询10-100次并测量时间:

SET TIMING ON;
DECLARE
   x NUMBER;
BEGIN
   FOR i IN 1..10 LOOP
      SELECT sum( seq_number ) INTO x
      FROM (
           SELECT  * FROM (
            SELECT T1.*
              FROM (    SELECT ROWNUM seq_number
                          FROM DUAL
                    CONNECT BY ROWNUM <= 1000000
            ) T1 
            LEFT JOIN TEMP_TABLE_NAME T2
            ON T1.seq_number = T2.seq_number
            WHERE T2.ROWID IS NULL
            ORDER BY T1.seq_number
            )
            WHERE ROWNUM <= 10
        );
    END LOOP;
END;
/

PL/SQL procedure successfully completed.

Elapsed: 00:00:11.750

10次 - 11.75秒,因此一个查询需要1.2秒。


下一个版本中,CONNECT BY中的限制使用子查询:

SET TIMING ON;
DECLARE
   x NUMBER;
BEGIN
   FOR i IN 1..10 LOOP
      SELECT sum( seq_number ) INTO x
      FROM (
           SELECT  * FROM (
            SELECT T1.*
              FROM (    SELECT ROWNUM seq_number
                          FROM DUAL
                    CONNECT BY ROWNUM <= (Select max( seq_number ) + 10 From TEMP_TABLE_NAME ) 
            ) T1 
            LEFT JOIN TEMP_TABLE_NAME T2
            ON T1.seq_number = T2.seq_number
            WHERE T2.ROWID IS NULL
            ORDER BY T1.seq_number
            )
            WHERE ROWNUM <= 10
        );
    END LOOP;
END;
/
PL/SQL procedure successfully completed.

Elapsed: 00:00:00.986

更好了 - 只需要100毫秒。
这说明,CONNECT BY 部分是最耗费时间的。
另一种尝试是使用一个包含预生成数字序列的表(一种类似于物化视图的方法),而不是在内存中每次生成数字的 CONNECT BY 子查询:
create table seq(
   seq_number int primary key
)
ORGANIZATION INDEX ;

INSERT INTO seq 
SELECT level FROM dual
CONNECT BY LEVEL <= 1000000;

SET TIMING ON;
DECLARE
   x NUMBER;
BEGIN
   FOR i IN 1..10 LOOP
      SELECT sum( seq_number ) INTO x
      FROM (
           SELECT  * FROM (
            SELECT T1.*
            FROM seq T1 
            LEFT JOIN TEMP_TABLE_NAME T2
            ON T1.seq_number = T2.seq_number
            WHERE T2.ROWID IS NULL
            ORDER BY T1.seq_number
            )
            WHERE ROWNUM <= 10
        );
    END LOOP;
END;
/

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.398

这个是最快的 - 仅需40毫秒。

第一个需要1200毫秒,而最后一个只需要40毫秒 - 快了30倍(3000%)。

感谢您的回答和有关如何有效优化工作流程和示例的提示。这真的很有帮助。非常感激。 - pOrinG

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接