使用索引的误区之五:空值的妙用

使用索引的误区之五: 空值的妙用

并不是因为完全为空的条目不被记录到索引中,就坚决不能使用空值,相反,有时候合理使用 oracle 的空值会为我们的查询带来几倍甚至几十倍的效率提高。

举个例子,加入有一个表,里面有个字段是“处理时间”,如果没有处理的事务,该列就为空,并且在大部分情况下,处理的事务总是在总记录数的 10 %或者更少,而等待处理的记录(“处理时间”这列为空)总是绝大多数的记录,那么在“等待时间”这列上建立索引,索引中就总是会保存很少的记录,我们希望的访问方式是,当访问表中所有代处理的记录(即 10 %或者更多的记录数目)时,我们希望通过全表扫描的方式来检索;然而,当我们希望访问已经处理的事务(即 5 %或者更少的记录数目)时,我们希望通过索引来访问,因为索引中的记录数目很少,请看下面的例子:

SQL> create table tt as select * from sys.dba_objects;

Table created

Executed in 0.601 seconds

SQL> alter table tt add (t int);

Table altered

Executed in 0.061 seconds

SQL> select count(*) from tt;

COUNT(*)

----------

6131c

Executed in 0.01 seconds

SQL> UPDATE tt set t=1 where owner='DEMO';

10 rows updated

Executed in 0.03 seconds

SQL> COMMIT;

Commit complete

Executed in 0 seconds

SQL> select count(*) from tt where OWNER='DEMO';

COUNT(*)

----------

10 ――――――――――――――已经处理的数目

Executed in 0.08 seconds

s

SQL> select count(*) from tt;

COUNT(*)

----------

6131 ――――――――――――――总记录数目

Executed in 0.01 seconds

** 下面的查询因为访问表中的大多数记录(代处理的记录,即 10 %以上的记录数目),可以看见,它如我们所希望的那样使用了全表扫描: **

select object_name from tt ** where t is null; **

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

--------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost |

--------------------------------------------------------------------

| 0 | SELECT STATEMENT | | | | |

|* 1 | TABLE ACCESS FULL | TT | | | |

--------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("TT"."T" IS NULL)

Note: rule based optimization

14 rows selected

Executed in 0.05 seconds

** 下面的查询因为要访问表中的少数记录,我们希望通过索引来访问: **

select object_name from tt ** where t=1; **

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

--------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost |

--------------------------------------------------------------------

| 0 | SELECT STATEMENT | | | | |

|* 1 | TABLE ACCESS FULL | TT | | | |

--------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("TT"."T"=1)

Note: rule based optimization

14 rows selected

Executed in 0.06 seconds

** 请注意,这里并没有如我们所希望的那样使用索引,而是使用了全表扫描,这里有一个结论: **

** 建立了索引后,要想在 CBO 下合理的使用索引,一定要定期的更新统计信息 **


** 下面我们分析一下索引,看看有什么效果: **

SQL> analyze index tt_idx validate structure;

Index analyzed

Executed in 0 seconds

SQL> select lf_rows from index_stats;

** LF_ROWS **

** ---------- **

** 10 ** ** ――――――――――索引中总共有 10 行 **

Executed in 0.05 seconds

SQL> exec dbms_stats.gather_index_stats('DEMO','TT_IDX');

PL/SQL procedure successfully completed

Executed in 0.03 seconds

SQL> SELECT DISTINCT_KEYS FROM USER_INDEXES;

DISTINCT_KEYS

-------------

1 ――――――――――只有一个键值

Executed in 0.05 seconds

SQL> select * from tt ** where t is null; **

已选择 6121 行。

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE

1 0 TABLE ACCESS (FULL) OF 'TT'

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

485 consistent gets

0 physical reads

0 redo size

355012 bytes sent via SQL*Net to client

4991 bytes received via SQL*Net from client

410 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

6121 rows processed

SQL> select * from tt where t=5;

未选定行

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE

1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TT'

2 1 INDEX (RANGE SCAN) OF 'TT_IDX' (NON-UNIQUE)

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

1 consistent gets

0 physical reads

0 redo size

964 bytes sent via SQL*Net to client

372 bytes received via SQL*Net from client

1 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

0 rows processed

SQL> select * from tt where t=1;

已选择 10 行。

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE

1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TT'

2 1 INDEX (RANGE SCAN) OF 'TT_IDX' (NON-UNIQUE)

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

4 consistent gets

0 physical reads

0 redo size

1639 bytes sent via SQL*Net to client

503 bytes received via SQL*Net from client

2 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

10 rows processed

SQL> update tt set t=2 where t=1;

已更新 10 行。

Execution Plan

----------------------------------------------------------

0 UPDATE STATEMENT Optimizer=CHOOSE

1 0 UPDATE OF 'TT'

2 1 INDEX (RANGE SCAN) OF 'TT_IDX' (NON-UNIQUE)

Statistics

----------------------------------------------------------

0 recursive calls

14 db block gets

1 consistent gets

0 physical reads

3216 redo size

616 bytes sent via SQL*Net to client

527 bytes received via SQL*Net from client

3 SQL*Net roundtrips to/from client

2 sorts (memory)

0 sorts (disk)

10 rows processed

SQL> set autotrace traceonly

SQL> update tt set t=3 where t is null;

6121 rows updated.

Execution Plan

----------------------------------------------------------

&nb

Published At
Categories with 数据库类
Tagged with
comments powered by Disqus