Oracle - 材料化视图或表

3

我有一个表(在Oracle 12c中),目前有2200万条记录,每天还会插入10000条记录。我们需要基于这个表进行计数,例如:

select col1, col2, count(*) cnt from my_table group by col1, col2;

这个查询将返回少于30行,而且col1col2的组合将是唯一的。

我们的应用程序需要经常检查CNT值,但是CNT的近似值已经足够好了。这意味着我们可以创建一个物化视图并每10-20分钟刷新它。

对于这个要求,物化视图是一个好选择吗?还是我应该为其创建一个普通表?

谢谢!


MView本身会创建一个包含从视图返回的数据的表格。因此,我看不出有什么区别。 - Naresh
@ILLUMINATI7590,没错,MV只是基于视图的表。但是,通过一个简单的创建语句,它可以创建表、视图、定期填充和刷新表的计划作业;简而言之,就像一站式购物。在适当的情况下,MV非常方便。 - WoMo
2
可以在 col1col2 上分别创建 BITMAP 索引。这样查询速度应该会更快。 - Wernfried Domscheit
2个回答

1

至少有三种不同的方法可以实现这一目标:

  1. Fast Refresh Materialized View A fast refresh materialized view is probably the ideal solution. The 10,000 rows inserted will have a small amount of overhead but then there is no need to rebuild anything; the new totals are available immediately after each commit and retrieving the new totals will be incredibly fast. The downside is that fast refresh materialized views are difficult to setup, and have lots of weird gotchas. They work well with your sample schema but may not work with a more complicated scenario.

    Sample Schema

    drop table my_table;
    
    create table my_table(
        id number not null,
        col1 number not null,
        col2 number not null,
        constraint my_table_pk primary key (id)
    );
    
    insert into my_table
    select level, mod(level, 30), mod(level+1, 30)
    from dual
    connect by level <= 100000;
    
    begin
        dbms_stats.gather_table_stats(user, 'MY_TABLE');
    end;
    /
    

    Create materialized view log and materialized view

    create materialized view log on my_table with rowid(col1, col2) including new values;
    
    create materialized view my_table_mv
    refresh fast on commit
    enable query rewrite as
    select col1, col2, count(*) total
    from my_table
    group by col1, col2;
    

    Query rewrite

    The sample query is silently modified to use the small materialized view instead of the large table.

    explain plan for
    select col1, col2, count(*) cnt
    from my_table
    group by col1, col2;
    
    select * from table(dbms_xplan.display);
    
    Plan hash value: 786752524
    
    --------------------------------------------------------------------------------------------
    | Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |             |    30 |   300 |     3   (0)| 00:00:01 |
    |   1 |  MAT_VIEW REWRITE ACCESS FULL| MY_TABLE_MV |    30 |   300 |     3   (0)| 00:00:01 |
    --------------------------------------------------------------------------------------------
    
  2. Compressed B*Tree Index If there are only 30 unique values the index should compress well and not take up much space. Then the index can be used in a fast full index scan and act like a skinny table. This method requires at least one value to be not null. If both could be null then a function-based index could be useful here.

    create index my_table_idx on my_table(col1, col2) compress;
    
  3. Bitmap index Bitmap indexes are small and fast when there are a small number of distinct values. However they can introduce disastrous locking problems for some types of DML.

    create bitmap index my_table_idx on my_table(col1, col2);
    

如果使用直接路径方法(/*+ APPEND */)将行插入表中,则顺便可以放弃使用物化视图日志。 - David Aldridge
@DavidAldridge 我会编辑一下,让我的讨论三个不相关的方法更清晰。物化视图日志对于物化视图是必要的。这样添加10,000条记录就不需要重建整个表了。 - Jon Heller
我认为答案已经足够清晰了。我只是在说(主要是为了未来的读者),如果表中唯一的更改是通过直接路径插入,则刷新提交时不需要MV日志-它们将不会在MV日志中表示,并且Oracle针对直接路径插入使用了不同的机制以便快速刷新提交。 - David Aldridge
我正在费尽心思想要想起用于直接路径快速刷新而不是MV日志的系统表的名称,但是它脱离了我的记忆。 - David Aldridge
这是你需要的 -- 我知道我之前写过关于这个的东西。https://oraclesponge.wordpress.com/2005/11/23/optimizing-materialized-views-part-iii-manual-refresh-mechanisms/ - David Aldridge
显示剩余3条评论

0

根据您的方法的精确程度,您还可以尝试使用SAMPLE子句:

select col1,
       col2,
       count(*) cnt
from   my_table sample(1)
group by col1, col2;

根据值的数据分布,这可能会给出一个合理的估计。您可以测试需要多高的数字才能给出合理的结果,但除非使用块子句,否则很少值得超过4或5:

select col1,
       col2,
       count(*) cnt
from   my_table sample block(10)
group by col1, col2;

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接