HelloDBA [中文]
Search Internet Search HelloDBA
  Oracle Technology Site. email: fuyuncat@gmail.com  MSN: fuyuncat@hotmail.com Add to circle  acoug  acoug 

CBO Statistics data gathering and internal arithmetic analysis -- Converting to endpoint value

[中文]

Author:  fuyuncat

Source:  www.HelloDBA.com

Date:  2010-01-13 03:06:07

Share to  Twitter Facebook GMail Blogger Orkut Google Bookmarks Sina Weibo QQ Renren Tieba Kaixin Douban Taobao

3. Converting to endpoint value 

    If we draw the histogram, the endpoint value should be x axis, and the endpoint number should be y axis. For those columns with histogram, oracle will convert their values to the data type of endpoint value, say, number. We will discuss how does the endpoint number and endpoint value be stored.
 

Endpoint Number

    For Endpoint Number, I have concluded it in previous paper "Frequency or Height Balanced".

  •   In Frequency Histogram, the Endpoint Number is the accumunltive records number of the values in sequence. For example, there are 2 records are "1", 1 record is "2", 5 records are "3" ..., then, the correspond buckets' Endpoint Number are 2, 3(2+1), 8(3+5);
  •   In Height Balanced Histogram, the Endpoint Number is the number of average bucket. For example, value 1,2,3 are located in the 1st bucket; value 4 is a popular value, which occupied 2 buckets, who has been compressed as 1 bucket; value 5 is located in the 3rd bucket..., then, Endpoint Number of these 3 buckets are 1, 3(2+1), 4(3+1).

Endpoing Value

    We can find the data type of endpoint value from the dictionary table histgrm$. It means all kinds of datatype will be converted to number finally.
 

SQL代码
  1. HELLODBA.COM>desc histgrm$   
  2.  Name          Null?    Type   
  3.  ------------- -------- -------------------------------------   
  4.  OBJ#          NOT NULL NUMBER   
  5.  COL#          NOT NULL NUMBER   
  6.  ROW#                   NUMBER   
  7.  BUCKET        NOT NULL NUMBER   
  8.  ENDPOINT      NOT NULL NUMBER   
  9.  INTCOL#       NOT NULL NUMBER   
  10.  EPVALUE                VARCHAR2(1000)   
  11.  SPARE1                 NUMBER   
  12.  SPARE2                 NUMBER   

    Let's look into the conversion one datatype by one.
 

    Note: Data precision to be analyzed for creating histogram is not same as the endpoint value data precision, which will also be discussed below.
 

    Let's prepare the test data first.
 

SQL代码
  1. HELLODBA.COM>create table demo.htc3 (a number, b date, c raw(100), d varchar2(100), e clob, f rowid);   
  2.   
  3. Table created.   
  4.   
  5. HELLODBA.COM>insert into demo.htc3 values (1, to_date('2010-12-07 00:00:01''YYYY-MM-DD HH24:MI:SS'), '01''A''A''AAAxdYAAFAAAPJUAAA');   
  6.   
  7. 1 row created.   
  8.   
  9. HELLODBA.COM>insert into demo.htc3 values (2, to_date('2010-12-07 00:00:02''YYYY-MM-DD HH24:MI:SS'), '02''BB''B','AAAxdYAAFAAAPJUAAB');   
  10.   
  11. 1 row created.   
  12.   
  13. HELLODBA.COM>insert into demo.htc3 values (3, to_date('2010-12-07 00:00:03''YYYY-MM-DD HH24:MI:SS'), '03''CCC''C','AAAxdYAAFAAAPJUAAC');   
  14.   
  15. 1 row created.   
  16.   
  17. HELLODBA.COM>insert into demo.htc3 values (4, to_date('2010-12-07 00:00:04''YYYY-MM-DD HH24:MI:SS'), '04''DDDDD''D','AAAxdYAAFAAAPJUAAD');   
  18.   
  19. 1 row created.   
  20.   
  21. HELLODBA.COM>insert into demo.htc3 values (5, to_date('2010-12-07 00:00:05''YYYY-MM-DD HH24:MI:SS'), '05''EEEEEE''E','AAAxdYAAFAAAPJUAAE');   
  22.   
  23. 1 row created.   
  24.   
  25. HELLODBA.COM>insert into demo.htc3 values (6, to_date('2010-12-07 12:50:01''YYYY-MM-DD HH24:MI:SS'), '06''FFFFFFF''F','AAAxdYAAFAAAPJUAAF');   
  26.   
  27. 1 row created.   
  28.   
  29. HELLODBA.COM>insert into demo.htc3 values (7.654321, to_date('2010-12-07 12:50:02''YYYY-MM-DD HH24:MI:SS')+1, '07''FFFFFF1''G','AAAxdYAAFAAAPJUAAG');   
  30.   
  31. 1 row created.   
  32.   
  33. HELLODBA.COM>insert into demo.htc3 values (8.7654321, to_date('2010-12-07 12:50:03''YYYY-MM-DD HH24:MI:SS')+2, '08''FFFFFF2''H','AAAxdYAAFAAAPJUAAH');   
  34.   
  35. 1 row created.   
  36.   
  37. HELLODBA.COM>insert into demo.htc3 values (9.87654321, to_date('2010-12-07 12:50:04''YYYY-MM-DD HH24:MI:SS')+3, '09''FFFFFF3''I','AAAxdYAAFAAAPJUAAI');   
  38.   
  39. 1 row created.   
  40.   
  41. HELLODBA.COM>insert into demo.htc3 values (10.987654321, to_date('2010-12-07 12:50:05''YYYY-MM-DD HH24:MI:SS'), '0A''FFFFFFF''J','AAAxdYAAFAAAPJUAAJ');   
  42.   
  43. 1 row created.   
  44.   
  45. HELLODBA.COM>insert into demo.htc3 values (2.123456789123123456789123456789123456789E33, to_date('2010-12-07 12:50:01''YYYY-MM-DD HH24:MI:SS')+100,'AC1265231212CDAC1265231212CDAC1265231212CDAC1265231212CDAC1265231212CD', lpad('A',35,'C'), lpad('A',35,'C'),'AAAxdYAAFAAAPJUAAK');   
  46.   
  47. 1 row created.   
  48.   
  49. HELLODBA.COM>insert into demo.htc3 values (2.123456789123123456789123456789123456790E33, to_date('2010-12-07 12:50:01''YYYY-MM-DD HH24:MI:SS')+101,'AC1265231212CDAC1265231212CDAC1265231212CDAC1265231212CDAC1265231212AB', lpad('A',35,'C')||'1', lpad('A',35,'C')||'1','AAAxdYAAFAAAPJUAAL');   
  50.   
  51. 1 row created.   
  52.   
  53. HELLODBA.COM>insert into demo.htc3 values (2.123456789123123456789123456789123456789E35, to_date('2010-12-07 12:50:59''YYYY-MM-DD HH24:MI:SS'), 'AC1265231212CDAC1265231212CDAC1265231212CDAC1265231212CDAC1265231212EF', lpad('A',35,'C')||'2', lpad('A',35,'C')||'2','AAAxdYAAFAAAPJUAAM');   
  54.   
  55. 1 row created.   
  56.   
  57. HELLODBA.COM>commit;   
  58.   
  59. Commit complete.   
  60.   
  61. HELLODBA.COM>set serveroutput on  
  62. HELLODBA.COM>begin  
  63.   2   dbms_output.enable(1000000);   
  64.   3   dbms_stats.set_param('TRACE',16383);   
  65.   4   dbms_stats.gather_table_stats('DEMO','HTC3',NULL,0,FALSE,'FOR ALL COLUMNS');   
  66.   5  end;   
  67.   6  /   
  68. ...   

Number (NUMBER and the subtypes)

    We can acknowledge oracle use the original data to group the bucket for histogram from the traced query.
 

SQL代码
  1. select substrb(dump(val, 16, 0, 32), 1, 120) ep, cnt   
  2.   from (select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring */   
  3.          "A" val, count(*) cnt   
  4.           from "DEMO"."HTC3" t   
  5.          where "A" is not null  
  6.          group by "A")   
  7.  order by val   

    To store a number value as endpoint value, oracle will ROUND it to the 15th position start from left end.
 

SQL代码
  1. HELLODBA.COM>set numw 50   
  2. HELLODBA.COM>select endpoint_number,   
  3.   2         endpoint_value   
  4.   3    from dba_histograms   
  5.   4   where owner = 'DEMO'  
  6.   5     and table_name = 'HTC3'  
  7.   6     and column_name = 'A';   
  8.   
  9.                          ENDPOINT_NUMBER                           ENDPOINT_VALUE   
  10. ---------------------------------------- ----------------------------------------   
  11.                                        1                                        1   
  12.                                        2                                        2   
  13.                                        3                                        3   
  14.                                        4                                        4   
  15.                                        5                                        5   
  16.                                        6                                        6   
  17.                                        7                                 7.654321   
  18.                                        8                                8.7654321   
  19.                                        9                               9.87654321   
  20.                                       10                             10.987654321   
  21.                                       11       2123456789123120000000000000000000   
  22.                                       12       2123456789123120000000000000000000   
  23.                                       13     212345678912312000000000000000000000   
  24.   
  25. 13 rows selected.   

    Wo can calculate the endpoint value from actual data value with rounding.
 

SQL代码
  1. HELLODBA.COM>select a, round(a,15-length(trunc(a))) from demo.htc3;   
  2.   
  3.                                        A             ROUND(A,15-LENGTH(TRUNC(A)))   
  4. ---------------------------------------- ----------------------------------------   
  5.                                        1                                        1   
  6.                                        2                                        2   
  7.                                        3                                        3   
  8.                                        4                                        4   
  9.                                        5                                        5   
  10.                                        6                                        6   
  11.                                 7.654321                                 7.654321   
  12.                                8.7654321                                8.7654321   
  13.                               9.87654321                               9.87654321   
  14.                             10.987654321                             10.987654321   
  15. 2123456789123123456789123456789123.45679       2123456789123120000000000000000000   
  16. 2123456789123123456789123456789123.45679       2123456789123120000000000000000000   
  17. 212345678912312345678912345678912345.679     212345678912312000000000000000000000   
  18.   
  19. 13 rows selected.   

    Because the difference of precision between data to create histogram and the endpoint value, there may be multiple buckets with same endpoint value. This will confuse the optimizer to estimate the cost. Look at below case.
 

SQL代码
  1. HELLODBA.COM>create table demo.htc5 (a number);   
  2.   
  3. Table created.   
  4.   
  5. HELLODBA.COM>insert into demo.htc5 values(123456789.123456789);   
  6.   
  7. 1 row created.   
  8.   
  9. HELLODBA.COM>insert into demo.htc5 values(123456789.123456799);   
  10.   
  11. 1 row created.   
  12.   
  13. HELLODBA.COM>insert into demo.htc5 values(123456789.123456799);   
  14.   
  15. 1 row created.   
  16.   
  17. HELLODBA.COM>insert into demo.htc5 values(123456789.123456799);   
  18.   
  19. 1 row created.   
  20.   
  21. HELLODBA.COM>insert into demo.htc5 values(123456789.123456799);   
  22.   
  23. 1 row created.   
  24.   
  25. HELLODBA.COM>insert into demo.htc5 values(123456789.123456799);   
  26.   
  27. 1 row created.   
  28.   
  29. HELLODBA.COM>insert into demo.htc5 values(123456799.123456799);   
  30.   
  31. 1 row created.   
  32.   
  33. HELLODBA.COM>insert into demo.htc5 values(123456799.123456799);   
  34.   
  35. 1 row created.   
  36.   
  37. HELLODBA.COM>commit;   
  38.   
  39. Commit complete.   
  40.   
  41. HELLODBA.COM>begin  
  42.   2   dbms_output.enable(1000000);   
  43.   3   dbms_stats.set_param('TRACE',16383);   
  44.   4   dbms_stats.gather_table_stats('DEMO','HTC5',NULL,0,FALSE,'FOR ALL COLUMNS');   
  45.   5  end;   
  46.   6  /   
  47.   
  48. PL/SQL procedure successfully completed.   
  49.   
  50. HELLODBA.COM>select endpoint_number,   
  51.   2         endpoint_value   
  52.   3    from dba_histograms   
  53.   4   where owner = 'DEMO'  
  54.   5     and table_name = 'HTC5'  
  55.   6     and column_name = 'A';   
  56.   
  57.                          ENDPOINT_NUMBER                           ENDPOINT_VALUE   
  58. ---------------------------------------- ----------------------------------------   
  59.                                        1                         123456789.123457   
  60.                                        6                         123456789.123457   
  61.                                        8                         123456799.123457   
 

    The endpoint value of the 1st & 2nd bucket are the same value. Let's explain the query to equal predict these values.
 

SQL代码
  1. HELLODBA.COM>set autot trace exp   
  2. HELLODBA.COM>select * from demo.htc5 where a=123456789.123456789;   
  3.   
  4. Execution Plan   
  5. ----------------------------------------------------------   
  6. Plan hash value: 4195264197   
  7.   
  8. --------------------------------------------------------------------------   
  9. | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |   
  10. --------------------------------------------------------------------------   
  11. |   0 | SELECT STATEMENT  |      |     5 |    60 |     3   (0)| 00:00:01 |   
  12. |*  1 |  TABLE ACCESS FULL| HTC5 |     5 |    60 |     3   (0)| 00:00:01 |   
  13. --------------------------------------------------------------------------   
  14.   
  15. Predicate Information (identified by operation id):   
  16. ---------------------------------------------------   
  17.   
  18.    1 - filter("A"=123456789.123456789)   
  19.   
  20. HELLODBA.COM>select * from demo.htc5 where a=123456789.123456799;   
  21.   
  22. Execution Plan   
  23. ----------------------------------------------------------   
  24. Plan hash value: 4195264197   
  25.   
  26. --------------------------------------------------------------------------   
  27. | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |   
  28. --------------------------------------------------------------------------   
  29. |   0 | SELECT STATEMENT  |      |     5 |    60 |     3   (0)| 00:00:01 |   
  30. |*  1 |  TABLE ACCESS FULL| HTC5 |     5 |    60 |     3   (0)| 00:00:01 |   
  31. --------------------------------------------------------------------------   
  32.   
  33. Predicate Information (identified by operation id):   
  34. ---------------------------------------------------   
  35.   
  36.    1 - filter("A"=123456789.123456799)   
  37.   
  38. HELLODBA.COM>select * from demo.htc5 where a=123456799.123456799;   
  39.   
  40. Execution Plan   
  41. ----------------------------------------------------------   
  42. Plan hash value: 4195264197   
  43.   
  44. --------------------------------------------------------------------------   
  45. | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |   
  46. --------------------------------------------------------------------------   
  47. |   0 | SELECT STATEMENT  |      |     2 |    24 |     3   (0)| 00:00:01 |   
  48. |*  1 |  TABLE ACCESS FULL| HTC5 |     2 |    24 |     3   (0)| 00:00:01 |   
  49. --------------------------------------------------------------------------   
  50.   
  51. Predicate Information (identified by operation id):   
  52. ---------------------------------------------------   
  53.   
  54.    1 - filter("A"=123456799.123456799)   

    When estimate the number of rows of the value of the 1st bucket, the optimizer got the number of the 2nd bucket.
 

Date (DATE, TIMESTAMP and subtypes)

    Oracle also uses the original data to create histogram:
 

SQL代码
  1. select substrb(dump(val, 16, 0, 32), 1, 120) ep, cnt   
  2.   from (select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring */   
  3.          "B" val, count(*) cnt   
  4.           from "DEMO"."HTC3" t   
  5.          where "B" is not null  
  6.          group by "B")   
  7.  order by val   

    To store the data as endpoint value, it needs convert the date to number. The rule is converting the value to days, both date & time parts. To convert time parts to day, it will convert it to seconds then divide 86400(24*60*60), which is the number of seconds of one day. After that, the number should also be ROUND to the 15th position before be stored.
 

SQL代码
  1. HELLODBA.COM>select endpoint_number,   
  2.   2         endpoint_value   
  3.   3    from dba_histograms   
  4.   4   where owner = 'DEMO'  
  5.   5     and table_name = 'HTC3'  
  6.   6     and column_name = 'B';   
  7.   
  8.                          ENDPOINT_NUMBER                           ENDPOINT_VALUE   
  9. ---------------------------------------- ----------------------------------------   
  10.                                        1                         2455538.00001157   
  11.                                        2                         2455538.00002315   
  12.                                        3                         2455538.00003472   
  13.                                        4                          2455538.0000463