[中文]
Author:
fuyuncat
Source:
www.HelloDBA.com
Date:
2011-07-21 08:01:37
NULLABILITY is an important property of column. However, many designers do not care naught for this property. In most scenarios, except primary key columns, all columns are not specified nullbility. By default, it's means nullable.
Actually, it's an importent factor impacting optimizer choosing execution plan. Let's build below table to further describe this issue.
SQL代码
- HELLODBA.COM>create table t_test1 tablespace DEMO as select * from dba_objects;
- Table created.
- HELLODBA.COM>alter table T_TEST1 add constraint T_TEST1_PK primary key (OBJECT_ID) using index tablespace DEMOTSINX;
- Table altered.
- HELLODBA.COM>update t_test1 set SUBOBJECT_NAME=OBJECT_NAME where SUBOBJECT_NAME is null;
- 32072 rows updated.
- HELLODBA.COM>commit;
- Commit complete.
- HELLODBA.COM>desc t_test1
- Name Null? Type
- ---------------------------------------------------------------------------------------------------------------------------------
- OWNER NOT NULL VARCHAR2(30)
- OBJECT_NAME NOT NULL VARCHAR2(30)
- SUBOBJECT_NAME VARCHAR2(30)
- OBJECT_ID NOT NULL NUMBER
- DATA_OBJECT_ID NUMBER
- OBJECT_TYPE VARCHAR2(19)
- CREATED NOT NULL DATE
- LAST_DDL_TIME NOT NULL DATE
- TIMESTAMP VARCHAR2(19)
- STATUS VARCHAR2(7)
- TEMPORARY VARCHAR2(1)
- GENERATED VARCHAR2(1)
- SECONDARY VARCHAR2(1)
- LIO NUMBER
Where clause predication evaluation
In the table T_TEST1, subobject_name actually does not contain NULL values, and there isn't NOT NULL constraint on it. We execute below SQL to query the number of NULL value in this column (zero of course).
SQL代码
- HELLODBA.COM>select * from t_test1 where SUBOBJECT_NAME is null;
- no rows selected
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 1883417357
- -----------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- -----------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 96 | 45 (0)| 00:00:46 |
- |* 1 | TABLE ACCESS FULL| T_TEST1 | 1 | 96 | 45 (0)| 00:00:46 |
- -----------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 1 - filter("SUBOBJECT_NAME" IS NULL)
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 665 consistent gets
- 0 physical reads
- 0 redo size
- 1048 bytes sent via SQL*Net to client
- 374 bytes received via SQL*Net from client
- 1 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 0 rows processed
We find it involved Full Table Scan (we will discuss impact on index in following part). We can not the plan and performance changed after we add NOT NULL constraint on it.
SQL代码
- HELLODBA.COM>alter table t_test1 modify SUBOBJECT_NAME not null;
- Table altered.
- HELLODBA.COM>select * from t_test1 where SUBOBJECT_NAME is null;
- no rows selected
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 4146611218
- ------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 96 | 0 (0)| |
- |* 1 | FILTER | | | | | |
- | 2 | TABLE ACCESS FULL| T_TEST1 | 47585 | 4461K| 45 (0)| 00:00:46 |
- ------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 1 - filter(NULL IS NOT NULL)
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 0 consistent gets
- 0 physical reads
- 0 redo size
- 1048 bytes sent via SQL*Net to client
- 374 bytes received via SQL*Net from client
- 1 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 0 rows processed
The execution plan adds a FITLER before Full Table Scan, with an expression (NULL IS NOT NULL), whose result is FALSE. Hence, child operation of FILTER actually is not performed. Correspondingly, CR in performance statistics data is 0.
The reason of why optimizer added such FILTER, is because optimizer refers to the NOT NULL constraint to evaluate WHERE clause when perform Query Transformation. If the resutl is evaluated as FALSE, it will add such FITLER to avoid high-cost operations. From 10053 trace file, we notice the evaluation impact on cost estimation.
SQL代码
- Cdn, Cost adjusted (to ~ 0) as where clause evalutes to FALSE
- Final - All Rows Plan: Best join order: 1
- Cost: 0.0000 Degree: 1 Card: 1.0000 Bytes: 4568160
- Resc: 0.0000 Resc_io: 0.0000 Resc_cpu: 0
- Resp: 0.0000 Resp_io: 0.0000 Resc_cpu: 0
NOT NULL constraint impact on index choosing
As we know, B*Tree index will not contain NULL keys. That is to say, records with NULL values in all indexed columns will not be built in the index. It also means that table records do not full map to index records if there is no NOT NULL constraint.
Now, we remove the NOT NULL constraint on subobject_name and index it.
SQL代码
- HELLODBA.COM>alter table t_test1 modify subobject_name null;
- Table altered.
- HELLODBA.COM>create index t_test1_subo_idx on t_test1(subobject_name) compute statistics;
- Index created.
Then execute below query to obtain 10 records with smallest subobject_name. To make it more effecient, we want to get the first 10 ROWIDs from index, which has be sorted by subobject_name, then access table to fetch data by ROWID.
SQL代码
- HELLODBA.COM>select owner,object_name,subobject_name from t_test1 t1, (select /*+index(t t_test1_subo_idx)*/rowid rid from t_test1 t where rownum<=10order by subobject_name) v where t1.rowid=v.rid;
- OWNER OBJECT_NAME SUBOBJECT_NAME
- ------------------------------ ------------------------------ ------------------------------
- SYS ICOL$ BBB
- SYS I_USER1 BBB
- SYS CON$ BBB
- SYS UNDO$ BBB
- SYS I_PROXY_ROLE_DATA$_1 BBB
- SYS I_OBJ# BBB
- SYS PROXY_ROLE_DATA$ BBB
- SYS I_IND1 BBB
- SYS I_CDEF2 BBB
- SYS C_COBJ# BBB
- 10 rows selected.
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 4050478946
- -----------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
- -----------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 10 | 560 | | 308 (1)| 00:05:09 |
- | 1 | NESTED LOOPS | | 10 | 560 | | 308 (1)| 00:05:09 |
- | 2 | VIEW | | 10 | 120 | | 298 (1)| 00:04:59 |
- | 3 | SORT ORDER BY | | 10 | 160 | 2248K| 298 (1)| 00:04:59 |
- |* 4 | COUNT STOPKEY | | | | | | |
- | 5 | TABLE ACCESS FULL | T_TEST1 | 47585 | 743K| | 45 (0)| 00:00:46 |
- | 6 | TABLE ACCESS BY USER ROWID| T_TEST1 | 1 | 44 | | 1 (0)| 00:00:02 |
- -----------------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 4 - filter(ROWNUM<=10)
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 14 consistent gets
- 0 physical reads
- 0 redo size
- 707 bytes sent via SQL*Net to client
- 385 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 1 sorts (memory)
- 0 sorts (disk)
- 10 rows processed
However, according to the execution plan and result, it does not run as we expected, and the result is also wrong. The reason is that optimizer is not sure index containing all records because of un-indexed NULL values.
We get what we want as soon as we added the constraint.
SQL代码
- HELLODBA.COM>alter table t_test1 modify subobject_name not null;
- Table altered.
- HELLODBA.COM>select owner,object_name,subobject_name from t_test1 t1, (select /*+index(t t_test1_subo_idx)*/rowid rid from t_test1 t where rownum<=10
- order by subobject_name) v where t1.rowid=v.rid;
- OWNER OBJECT_NAME SUBOBJECT_NAME
- ------------------------------ ------------------------------ ------------------------------
- DEMO NO A
- DEMO NO A
- DEMO NO A
- SYS ICOL$ BBB
- SYS I_USER1 BBB
- SYS CON$ BBB
- SYS UNDO$ BBB
- SYS C_COBJ# BBB
- SYS I_OBJ# BBB
- SYS PROXY_ROLE_DATA$ BBB
- 10 rows selected.
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 3198566056
- ------------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ------------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 10 | 560 | 71 (0)| 00:01:11 |
- | 1 | NESTED LOOPS | | 10 | 560 | 71 (0)| 00:01:11 |
- | 2 | VIEW | | 10 | 120 | 61 (0)| 00:01:01 |
- |* 3 | COUNT STOPKEY | | | | | |
- | 4 | INDEX FULL SCAN | T_TEST1_SUBO_IDX | 47585 | 743K| 61 (0)| 00:01:01 |
- | 5 | TABLE ACCESS BY USER ROWID| T_TEST1 | 1 | 44 | 1 (0)| 00:00:02 |
- ------------------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 3 - filter(ROWNUM<=10)
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 13 consistent gets
- 0 physical reads
- 0 redo size
- 681 bytes sent via SQL*Net to client
- 385 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 10 rows processed
NOT NULL constraint impact on JOIN
NOT NULL constraint also impacts on optimizer choosing JOIN methods in excution plan. Let's create anthoer table.
SQL代码
- HELLODBA.COM>create table t_test2 tablespace DEMO as select * from dba_tables;
- Table created.
- HELLODBA.COM>alter table T_TEST2 add constraint T_TEST2_PK primary key (OWNER,TABLE_NAME) using index tablespace DEMOTSINX;
- Table altered.
- HELLODBA.COM>desc t_test2
- Name Null? Type
- -------------- -------- -----------------
- OWNER NOT NULL VARCHAR2(30)
- TABLE_NAME NOT NULL VARCHAR2(30)
- TABLESPACE_NAME VARCHAR2(30)
- CLUSTER_NAME VARCHAR2(30)
- IOT_NAME VARCHAR2(30)
- STATUS VARCHAR2(8)
- PCT_FREE NUMBER
- PCT_USED NUMBER
- ...
Remove the NOT NULL constraint again.
SQL代码
- HELLODBA.COM>alter table t_test1 modify subobject_name null;
- Table altered.
We execute below query to find records in T_TEST1 with subobject_name is not table_name.
SQL代码
- HELLODBA.COM>select t1.owner, t1.object_name, t1.subobject_name from t_test1 t1 where subobject_name not in (select table_name from t_test2 t2);
- 45135 rows selected.
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 3538907136
- ------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 2379 | 76128 | 51 (0)| 00:00:52 |
- |* 1 | FILTER | | | | | |
- | 2 | TABLE ACCESS FULL| T_TEST1 | 47585 | 1487K| 45 (0)| 00:00:46 |
- |* 3 | TABLE ACCESS FULL| T_TEST2 | 1 | 18 | 6 (0)| 00:00:07 |
- ------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 1 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "T_TEST2" "T2" WHERE
- LNNVL("TABLE_NAME"<>:B1)))
- 3 - filter(LNNVL("TABLE_NAME"<>:B1))
- Statistics
- ----------------------------------------------------------
- 392 recursive calls
- 0 db block gets
- 2217674 consistent gets
- 0 physical reads
- 0 redo size
- 2329590 bytes sent via SQL*Net to client
- 33473 bytes received via SQL*Net from client
- 3010 SQL*Net roundtrips to/from client
- 5 sorts (memory)
- 0 sorts (disk)
- 45135 rows processed
We notcie that execution plan invovled with function LNNVL, NOT EXISTS as well as FILTER to get the result, with poor performance.
Note: The LNNVL function provides a concise way to evaluate a condition when one or both operands of the condition may be null.
And we add the NOT NULL constraint again.
SQL代码
- HELLODBA.COM>alter table t_test1 modify subobject_name not null;
- Table altered.
- HELLODBA.COM>select /*+ordered*/t1.owner, t1.object_name, t1.subobject_name from t_test1 t1 where not exists (select 1 from t_test2 t2 where t1.subobject_name = t2.table_name);
- 45135 rows selected.
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 3049160152
- ---------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
- ---------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 50 | | 475 (1)| 00:07:56 |
- | 1 | MERGE JOIN ANTI | | 1 | 50 | | 475 (1)| 00:07:56 |
- | 2 | SORT JOIN | | 47585 | 1487K| 4136K| 458 (1)| 00:07:39 |
- | 3 | TABLE ACCESS FULL| T_TEST1 | 47585 | 1487K| | 45 (0)| 00:00:46 |
- |* 4 | SORT UNIQUE | | 2070 | 37260 | 120K| 17 (6)| 00:00:18 |
- | 5 | TABLE ACCESS FULL| T_TEST2 | 2070 | 37260 | | 6 (0)| 00:00:07 |
- ---------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 4 - access("T1"."SUBOBJECT_NAME"="T2"."TABLE_NAME")
- filter("T1"."SUBOBJECT_NAME"="T2"."TABLE_NAME")
- Statistics
- ----------------------------------------------------------
- 41 recursive calls
- 141 db block gets
- 882 consistent gets
- 1432 physical reads
- 0 redo size
- 2050667 bytes sent via SQL*Net to client
- 33473 bytes received via SQL*Net from client
- 3010 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 2 sorts (disk)
- 45135 rows processed
The execution plan adopts ANTI-JOIN to get data, with better performance.
As we know, we should use IS NULL or IS NOT NULL to identify NULL values, otherwise, the expression will always reture FALSE if it involved NULL. That's why NULL VALUES do not impact JOIN or SEMI-JOIN.
SQL代码
- HELLODBA.COM>select 1 from dual where null='';
- no rows selected
- HELLODBA.COM>select 1 from dual where null=null;
- no rows selected
- HELLODBA.COM>select 1 from dual where null is null;
- 1
- ----------
- 1
ANTI-JOIN identifies data with JOIN methods (NESTED-LOOP, MERGE, HASH). Once it finds records matched with JOIN criteria, it spots the record should be discarded and will not match remaining data in a data set, which quite improved performance. When join data, it involves equal (=) matcher, which is not suitable for NULL data, therefore, optimizer will not adopt ANTI-JOIN if there is potential NULL values in the join columns.
Note: with NULL values or not, NOT EXISTS clause could adopt ANTI-JOIN, but its logic is not same as NOT IN. The results are different in below case.
SQL代码
- HELLODBA.COM>alter table t_test1 modify subobject_name null;
- Table altered.
- HELLODBA.COM>update t_test1 set subobject_name=null where rownum <=10;
- 10 rows updated.
- HELLODBA.COM>select t1.owner, t1.object_name, t1.subobject_name from t_test1 t1 where subobject_name not in (select table_name from t_test2 t2);
- 45129 rows selected.
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 3538907136
- ------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 2379 | 76128 | 51 (0)| 00:00:52 |
- |* 1 | FILTER | | | | | |
- | 2 | TABLE ACCESS FULL| T_TEST1 | 47585 | 1487K| 45 (0)| 00:00:46 |
- |* 3 | TABLE ACCESS FULL| T_TEST2 | 1 | 18 | 6 (0)| 00:00:07 |
- ------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 1 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "T_TEST2" "T2" WHERE
- LNNVL("TABLE_NAME"<>:B1)))
- 3 - filter(LNNVL("TABLE_NAME"<>:B1))
- Statistics
- ----------------------------------------------------------
- 745 recursive calls
- 216 db block gets
- 2217413 consistent gets
- 13 physical reads
- 0 redo size
- 2329442 bytes sent via SQL*Net to client
- 33473 bytes received via SQL*Net from client
- 3010 SQL*Net roundtrips to/from client
- 26 sorts (memory)
- 0 sorts (disk)
- 45129 rows processed
- HELLODBA.COM>select t1.owner, t1.object_name, t1.subobject_name from t_test1 t1 where not exists (select 1 from t_test2 t2 where t1.subobject_name = t
- 2.table_name);
- 45139 rows selected.
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 2383621862
- --------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- --------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 50 | 52 (2)| 00:00:52 |
- |* 1 | HASH JOIN RIGHT ANTI| | 1 | 50 | 52 (2)| 00:00:52 |
- | 2 | TABLE ACCESS FULL | T_TEST2 | 2070 | 37260 | 6 (0)| 00:00:07 |
- | 3 | TABLE ACCESS FULL | T_TEST1 | 47585 | 1487K| 45 (0)| 00:00:46 |
- --------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 1 - access("T1"."SUBOBJECT_NAME"="T2"."TABLE_NAME")
- Statistics
- ----------------------------------------------------------
- 49 recursive calls
- 360 db block gets
- 4130 consistent gets
- 0 physical reads
- 0 redo size
- 2329713 bytes sent via SQL*Net to client
- 33484 bytes received via SQL*Net from client
- 3011 SQL*Net roundtrips to/from client
- 2 sorts (memory)
- 0 sorts (disk)
- 45139 rows processed
- HELLODBA.COM>rollback;
- Rollback complete.
--- Fuyuncat ---