How to Get Random Rows from a Result Set

Starting with Oracle8i, Oracle provides a sample table scan feature.

Oracle's basic methods for accessing table data include:
1. Full table scan
2. Sample table scan

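Full table scan
A full table scan returns all rows in the table.
To perform a full table scan, Oracle reads every row in the table and checks whether each one satisfies the WHERE clause. Oracle reads the data blocks allocated to the table sequentially, so a full table scan can benefit from multiblock reads.
Oracle reads each data block only once.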

Sample table scan
A sample table scan returns a random sample of the data in the table.
This access method requires the SAMPLE option or the SAMPLE BLOCK option in the FROM clause.

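SAMPLE option:
When a sample table scan samples by row, Oracle reads a specified percentage of the table's rows and checks each of them against the WHERE clause to decide which rows to return.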

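SAMPLE BLOCK option:
With this option, Oracle reads a specified percentage of the table's blocks and checks the rows in those blocks against the WHERE clause, returning the rows that match.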
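The two options can be sketched side by side as follows. This is a minimal illustration, assuming the article's EMPLOYEE table and its EMPNO, ENAME and DEPTNO columns; the 10 percent figure and the `deptno = 30` filter are arbitrary values chosen for the example.

```sql
-- Row sampling: Oracle samples roughly 10% of the rows,
-- then applies the WHERE clause to the sampled rows only.
SELECT empno, ename, deptno
  FROM employee SAMPLE (10)
 WHERE deptno = 30;

-- Block sampling: Oracle samples roughly 10% of the blocks,
-- then applies the WHERE clause to every row in those blocks.
SELECT empno, ename, deptno
  FROM employee SAMPLE BLOCK (10)
 WHERE deptno = 30;
```

Because the filter runs after sampling, either query can return no rows at all even when matching rows exist in the table.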

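Sample_Percent:
Sample_Percent is a number giving the percentage of the total rows (or blocks, with SAMPLE BLOCK) to include in the sample. It acts as a selection probability, so the number of rows actually returned varies from run to run.
The value must lie in the range [0.000001, 99.999999].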
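One use of the percentage is a rough row-count estimate: count the sampled rows and scale by the inverse of the sampling rate. The sketch below assumes the article's EMPLOYEE table and a 10 percent sample; the alias `estimated_rows` is made up for the example.

```sql
-- Scale the sampled count by 100 / sample_percent (here 100 / 10 = 10).
-- The result is only an estimate and changes between executions.
SELECT COUNT(*) * 10 AS estimated_rows
  FROM employee SAMPLE (10);
```

On a table as small as the 14-row demo table the estimate will be very noisy; the approach is more useful on large tables, where a full count is expensive.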

1. Using the SAMPLE option

> SQL> select * from employee SAMPLE(30);
>
>      EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
> ---------- ---------- --------- ---------- --------- ---------- ---------- ----------
>       7369 SMITH      CLERK           7902 17-DEC-80        800                    20
>       7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20
>       7839 KING       PRESIDENT            17-NOV-81       5000                    10
>
> Execution Plan
> ----------------------------------------------------------
>    0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=25 Bytes=2175)
>    1    0   TABLE ACCESS (SAMPLE) OF 'EMPLOYEE' (Cost=2 Card=25 Bytes=2175)
>
> Statistics
> ----------------------------------------------------------
>           0  recursive calls
>           0  db block gets
>           5  consistent gets
>           0  physical reads
>           0  redo size
>         880  bytes sent via SQL*Net to client
>         503  bytes received via SQL*Net from client
>           2  SQL*Net roundtrips to/from client
>           0  sorts (memory)
>           0  sorts (disk)
>           3  rows processed
>
> SQL> select * from employee SAMPLE(20);
>
>      EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
> ---------- ---------- --------- ---------- --------- ---------- ---------- ----------
>       7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
>       7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
>
> Execution Plan
> ----------------------------------------------------------
>    0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=16 Bytes=1392)
>    1    0   TABLE ACCESS (SAMPLE) OF 'EMPLOYEE' (Cost=2 Card=16 Bytes=1392)
>
> Statistics
> ----------------------------------------------------------
>           0  recursive calls
>           0  db block gets
>           5  consistent gets
>           0  physical reads
>           0  redo size
>         839  bytes sent via SQL*Net to client
>         503  bytes received via SQL*Net from client
>           2  SQL*Net roundtrips to/from client
>           0  sorts (memory)
>           0  sorts (disk)
>           2  rows processed


2. Using the SAMPLE BLOCK option

> SQL> SELECT * FROM employee SAMPLE BLOCK (50);
>
>      EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
> ---------- ---------- --------- ---------- --------- ---------- ---------- ----------
>       7369 SMITH      CLERK           7902 17-DEC-80        800                    20
>       7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
>       7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
>       7566 JONES      MANAGER         7839 02-APR-81       2975                    20
>       7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
>       7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
>       7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
>       7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20
>       7839 KING       PRESIDENT            17-NOV-81       5000                    10
>       7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
>
> 10 rows selected.
>
> Execution Plan
> ----------------------------------------------------------
>    0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=41 Bytes=3567)
>    1    0   TABLE ACCESS (SAMPLE) OF 'EMPLOYEE' (Cost=2 Card=41 Bytes=3567)
>
> Statistics
> ----------------------------------------------------------
>           0  recursive calls
>           0  db block gets
>           4  consistent gets
>           0  physical reads
>           0  redo size
>        1162  bytes sent via SQL*Net to client
>         503  bytes received via SQL*Net from client
>           2  SQL*Net roundtrips to/from client
>           0  sorts (memory)
>           0  sorts (disk)
>          10  rows processed
>
> SQL>


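3. Sampling the first n rows

This can also be done with the dbms_random package: sort the rows by a random value and keep the first n.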

> SQL> select * from (
>   2  select * from employee
>   3  order by dbms_random.value )
>   4  where rownum <= 4;
>
>      EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
> ---------- ---------- --------- ---------- --------- ---------- ---------- ----------
>       7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
>       7839 KING       PRESIDENT            17-NOV-81       5000                    10
>       7369 SMITH      CLERK           7902 17-DEC-80        800                    20
>       7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20
>
> Execution Plan
> ----------------------------------------------------------
>    0      SELECT STATEMENT Optimizer=CHOOSE
>    1    0   COUNT (STOPKEY)
>    2    1     VIEW
>    3    2       SORT (ORDER BY STOPKEY)
>    4    3         TABLE ACCESS (FULL) OF 'EMPLOYEE'
>
> Statistics
> ----------------------------------------------------------
>           0  recursive calls
>           0  db block gets
>           3  consistent gets
>           0  physical reads
>           0  redo size
>         927  bytes sent via SQL*Net to client
>         503  bytes received via SQL*Net from client
>           2  SQL*Net roundtrips to/from client
>           1  sorts (memory)
>           0  sorts (disk)
>           4  rows processed
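The plan for the dbms_random query shows a full table scan feeding a sort, which may be costly on a large table. One possible variation, sketched here and not taken from the original article, is to shrink the input with SAMPLE first and then shuffle only the sampled rows. The 10 percent figure and the row limit of 4 are assumptions for illustration.

```sql
-- Shuffle only a sampled subset instead of the whole table.
-- Assumption: a 10% sample is comfortably larger than the 4 rows wanted.
SELECT *
  FROM (SELECT *
          FROM employee SAMPLE (10)
         ORDER BY dbms_random.value)
 WHERE rownum <= 4;
```

The trade-off is that the query can return fewer than 4 rows when the sample happens to be small, so the percentage has to be chosen with the table size in mind.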


For comparison, here is the SAMPLE option:

> SQL> SELECT * FROM employee SAMPLE (40);
>
>      EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
> ---------- ---------- --------- ---------- --------- ---------- ---------- ----------
>       7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
>       7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
>       7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
>       7839 KING       PRESIDENT            17-NOV-81       5000                    10
>       7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
>
> Execution Plan
> ----------------------------------------------------------
>    0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=33 Bytes=2871)
>    1    0   TABLE ACCESS (SAMPLE) OF 'EMPLOYEE' (Cost=2 Card=33 Bytes=2871)
>
> Statistics
> ----------------------------------------------------------
>           0  recursive calls
>           0  db block gets
>           5  consistent gets
>           0  physical reads
>           0  redo size
>         961  bytes sent via SQL*Net to client
>         503  bytes received via SQL*Net from client
>           2  SQL*Net roundtrips to/from client
>           0  sorts (memory)
>           0  sorts (disk)
>           5  rows processed
>
> SQL>


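Points to note:

1. SAMPLE works only on a single table; it cannot be used with joins or remote tables.
2. SAMPLE causes the statement to use the cost-based optimizer (CBO) automatically.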

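About the author:
eygle, an Oracle technology enthusiast from itpub, the largest Oracle technology forum in China.
www.eygle.com is the author's personal site. You can contact the author at [email protected]. Technical discussion and link exchange are welcome.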


Original source:

http://www.eygle.com/sql/How.To.Get.Random.Output.Of.Record.Set.htm

