[中文] Search Internet Search HelloDBA
Oracle Technology Site. email: fuyuncat@gmail.com  MSN: fuyuncat@hotmail.com

# CBO Statistics data gathering and internal arithmetic analysis -- Converting to endpoint value

[中文]

Author:  fuyuncat

Source:  www.HelloDBA.com

Date:  2010-01-13 03:06:07

Share to

## 3. Converting to endpoint value

If we draw the histogram, the endpoint value should be x axis, and the endpoint number should be y axis. For those columns with histogram, oracle will convert their values to the data type of endpoint value, say, number. We will discuss how does the endpoint number and endpoint value be stored.

### Endpoint Number

For Endpoint Number, I have concluded it in previous paper "Frequency or Height Balanced".

•   In Frequency Histogram, the Endpoint Number is the accumunltive records number of the values in sequence. For example, there are 2 records are "1", 1 record is "2", 5 records are "3" ..., then, the correspond buckets' Endpoint Number are 2, 3(2+1), 8(3+5);
•   In Height Balanced Histogram, the Endpoint Number is the number of average bucket. For example, value 1,2,3 are located in the 1st bucket; value 4 is a popular value, which occupied 2 buckets, who has been compressed as 1 bucket; value 5 is located in the 3rd bucket..., then, Endpoint Number of these 3 buckets are 1, 3(2+1), 4(3+1).

### Endpoing Value

We can find the data type of endpoint value from the dictionary table histgrm\$. It means all kinds of datatype will be converted to number finally.

SQL代码
1. HELLODBA.COM>desc histgrm\$
2.  Name          Null?    Type
3.  ------------- -------- -------------------------------------
4.  OBJ#          NOT NULL NUMBER
5.  COL#          NOT NULL NUMBER
6.  ROW#                   NUMBER
7.  BUCKET        NOT NULL NUMBER
8.  ENDPOINT      NOT NULL NUMBER
9.  INTCOL#       NOT NULL NUMBER
10.  EPVALUE                VARCHAR2(1000)
11.  SPARE1                 NUMBER
12.  SPARE2                 NUMBER

Let's look into the conversion one datatype by one.

Note: Data precision to be analyzed for creating histogram is not same as the endpoint value data precision, which will also be discussed below.

Let's prepare the test data first.

SQL代码
1. HELLODBA.COM>create table demo.htc3 (a number, b date, c raw(100), d varchar2(100), e clob, f rowid);
2.
3. Table created.
4.
5. HELLODBA.COM>insert into demo.htc3 values (1, to_date('2010-12-07 00:00:01''YYYY-MM-DD HH24:MI:SS'), '01''A''A''AAAxdYAAFAAAPJUAAA');
6.
7. 1 row created.
8.
9. HELLODBA.COM>insert into demo.htc3 values (2, to_date('2010-12-07 00:00:02''YYYY-MM-DD HH24:MI:SS'), '02''BB''B','AAAxdYAAFAAAPJUAAB');
10.
11. 1 row created.
12.
13. HELLODBA.COM>insert into demo.htc3 values (3, to_date('2010-12-07 00:00:03''YYYY-MM-DD HH24:MI:SS'), '03''CCC''C','AAAxdYAAFAAAPJUAAC');
14.
15. 1 row created.
16.
17. HELLODBA.COM>insert into demo.htc3 values (4, to_date('2010-12-07 00:00:04''YYYY-MM-DD HH24:MI:SS'), '04''DDDDD''D','AAAxdYAAFAAAPJUAAD');
18.
19. 1 row created.
20.
21. HELLODBA.COM>insert into demo.htc3 values (5, to_date('2010-12-07 00:00:05''YYYY-MM-DD HH24:MI:SS'), '05''EEEEEE''E','AAAxdYAAFAAAPJUAAE');
22.
23. 1 row created.
24.
25. HELLODBA.COM>insert into demo.htc3 values (6, to_date('2010-12-07 12:50:01''YYYY-MM-DD HH24:MI:SS'), '06''FFFFFFF''F','AAAxdYAAFAAAPJUAAF');
26.
27. 1 row created.
28.
29. HELLODBA.COM>insert into demo.htc3 values (7.654321, to_date('2010-12-07 12:50:02''YYYY-MM-DD HH24:MI:SS')+1, '07''FFFFFF1''G','AAAxdYAAFAAAPJUAAG');
30.
31. 1 row created.
32.
33. HELLODBA.COM>insert into demo.htc3 values (8.7654321, to_date('2010-12-07 12:50:03''YYYY-MM-DD HH24:MI:SS')+2, '08''FFFFFF2''H','AAAxdYAAFAAAPJUAAH');
34.
35. 1 row created.
36.
37. HELLODBA.COM>insert into demo.htc3 values (9.87654321, to_date('2010-12-07 12:50:04''YYYY-MM-DD HH24:MI:SS')+3, '09''FFFFFF3''I','AAAxdYAAFAAAPJUAAI');
38.
39. 1 row created.
40.
41. HELLODBA.COM>insert into demo.htc3 values (10.987654321, to_date('2010-12-07 12:50:05''YYYY-MM-DD HH24:MI:SS'), '0A''FFFFFFF''J','AAAxdYAAFAAAPJUAAJ');
42.
43. 1 row created.
44.
46.
47. 1 row created.
48.
50.
51. 1 row created.
52.
53. HELLODBA.COM>insert into demo.htc3 values (2.123456789123123456789123456789123456789E35, to_date('2010-12-07 12:50:59''YYYY-MM-DD HH24:MI:SS'), 'AC1265231212CDAC1265231212CDAC1265231212CDAC1265231212CDAC1265231212EF', lpad('A',35,'C')||'2', lpad('A',35,'C')||'2','AAAxdYAAFAAAPJUAAM');
54.
55. 1 row created.
56.
57. HELLODBA.COM>commit;
58.
59. Commit complete.
60.
61. HELLODBA.COM>set serveroutput on
62. HELLODBA.COM>begin
63.   2   dbms_output.enable(1000000);
64.   3   dbms_stats.set_param('TRACE',16383);
65.   4   dbms_stats.gather_table_stats('DEMO','HTC3',NULL,0,FALSE,'FOR ALL COLUMNS');
66.   5  end;
67.   6  /
68. ...

#### Number (NUMBER and the subtypes)

We can acknowledge oracle use the original data to group the bucket for histogram from the traced query.

SQL代码
1. select substrb(dump(val, 16, 0, 32), 1, 120) ep, cnt
2.   from (select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring */
3.          "A" val, count(*) cnt
4.           from "DEMO"."HTC3" t
5.          where "A" is not null
6.          group by "A")
7.  order by val

To store a number value as endpoint value, oracle will ROUND it to the 15th position start from left end.

SQL代码
1. HELLODBA.COM>set numw 50
2. HELLODBA.COM>select endpoint_number,
3.   2         endpoint_value
4.   3    from dba_histograms
5.   4   where owner = 'DEMO'
6.   5     and table_name = 'HTC3'
7.   6     and column_name = 'A';
8.
9.                          ENDPOINT_NUMBER                           ENDPOINT_VALUE
10. ---------------------------------------- ----------------------------------------
11.                                        1                                        1
12.                                        2                                        2
13.                                        3                                        3
14.                                        4                                        4
15.                                        5                                        5
16.                                        6                                        6
17.                                        7                                 7.654321
18.                                        8                                8.7654321
19.                                        9                               9.87654321
20.                                       10                             10.987654321
21.                                       11       2123456789123120000000000000000000
22.                                       12       2123456789123120000000000000000000
23.                                       13     212345678912312000000000000000000000
24.
25. 13 rows selected.

Wo can calculate the endpoint value from actual data value with rounding.

SQL代码
1. HELLODBA.COM>select a, round(a,15-length(trunc(a))) from demo.htc3;
2.
3.                                        A             ROUND(A,15-LENGTH(TRUNC(A)))
4. ---------------------------------------- ----------------------------------------
5.                                        1                                        1
6.                                        2                                        2
7.                                        3                                        3
8.                                        4                                        4
9.                                        5                                        5
10.                                        6                                        6
11.                                 7.654321                                 7.654321
12.                                8.7654321                                8.7654321
13.                               9.87654321                               9.87654321
14.                             10.987654321                             10.987654321
15. 2123456789123123456789123456789123.45679       2123456789123120000000000000000000
16. 2123456789123123456789123456789123.45679       2123456789123120000000000000000000
17. 212345678912312345678912345678912345.679     212345678912312000000000000000000000
18.
19. 13 rows selected.

Because the difference of precision between data to create histogram and the endpoint value, there may be multiple buckets with same endpoint value. This will confuse the optimizer to estimate the cost. Look at below case.

SQL代码
1. HELLODBA.COM>create table demo.htc5 (a number);
2.
3. Table created.
4.
5. HELLODBA.COM>insert into demo.htc5 values(123456789.123456789);
6.
7. 1 row created.
8.
9. HELLODBA.COM>insert into demo.htc5 values(123456789.123456799);
10.
11. 1 row created.
12.
13. HELLODBA.COM>insert into demo.htc5 values(123456789.123456799);
14.
15. 1 row created.
16.
17. HELLODBA.COM>insert into demo.htc5 values(123456789.123456799);
18.
19. 1 row created.
20.
21. HELLODBA.COM>insert into demo.htc5 values(123456789.123456799);
22.
23. 1 row created.
24.
25. HELLODBA.COM>insert into demo.htc5 values(123456789.123456799);
26.
27. 1 row created.
28.
29. HELLODBA.COM>insert into demo.htc5 values(123456799.123456799);
30.
31. 1 row created.
32.
33. HELLODBA.COM>insert into demo.htc5 values(123456799.123456799);
34.
35. 1 row created.
36.
37. HELLODBA.COM>commit;
38.
39. Commit complete.
40.
41. HELLODBA.COM>begin
42.   2   dbms_output.enable(1000000);
43.   3   dbms_stats.set_param('TRACE',16383);
44.   4   dbms_stats.gather_table_stats('DEMO','HTC5',NULL,0,FALSE,'FOR ALL COLUMNS');
45.   5  end;
46.   6  /
47.
48. PL/SQL procedure successfully completed.
49.
50. HELLODBA.COM>select endpoint_number,
51.   2         endpoint_value
52.   3    from dba_histograms
53.   4   where owner = 'DEMO'
54.   5     and table_name = 'HTC5'
55.   6     and column_name = 'A';
56.
57.                          ENDPOINT_NUMBER                           ENDPOINT_VALUE
58. ---------------------------------------- ----------------------------------------
59.                                        1                         123456789.123457
60.                                        6                         123456789.123457
61.                                        8                         123456799.123457

The endpoint value of the 1st & 2nd bucket are the same value. Let's explain the query to equal predict these values.

SQL代码
1. HELLODBA.COM>set autot trace exp
2. HELLODBA.COM>select * from demo.htc5 where a=123456789.123456789;
3.
4. Execution Plan
5. ----------------------------------------------------------
6. Plan hash value: 4195264197
7.
8. --------------------------------------------------------------------------
9. | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
10. --------------------------------------------------------------------------
11. |   0 | SELECT STATEMENT  |      |     5 |    60 |     3   (0)| 00:00:01 |
12. |*  1 |  TABLE ACCESS FULL| HTC5 |     5 |    60 |     3   (0)| 00:00:01 |
13. --------------------------------------------------------------------------
14.
15. Predicate Information (identified by operation id):
16. ---------------------------------------------------
17.
18.    1 - filter("A"=123456789.123456789)
19.
20. HELLODBA.COM>select * from demo.htc5 where a=123456789.123456799;
21.
22. Execution Plan
23. ----------------------------------------------------------
24. Plan hash value: 4195264197
25.
26. --------------------------------------------------------------------------
27. | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
28. --------------------------------------------------------------------------
29. |   0 | SELECT STATEMENT  |      |     5 |    60 |     3   (0)| 00:00:01 |
30. |*  1 |  TABLE ACCESS FULL| HTC5 |     5 |    60 |     3   (0)| 00:00:01 |
31. --------------------------------------------------------------------------
32.
33. Predicate Information (identified by operation id):
34. ---------------------------------------------------
35.
36.    1 - filter("A"=123456789.123456799)
37.
38. HELLODBA.COM>select * from demo.htc5 where a=123456799.123456799;
39.
40. Execution Plan
41. ----------------------------------------------------------
42. Plan hash value: 4195264197
43.
44. --------------------------------------------------------------------------
45. | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
46. --------------------------------------------------------------------------
47. |   0 | SELECT STATEMENT  |      |     2 |    24 |     3   (0)| 00:00:01 |
48. |*  1 |  TABLE ACCESS FULL| HTC5 |     2 |    24 |     3   (0)| 00:00:01 |
49. --------------------------------------------------------------------------
50.
51. Predicate Information (identified by operation id):
52. ---------------------------------------------------
53.
54.    1 - filter("A"=123456799.123456799)

When estimate the number of rows of the value of the 1st bucket, the optimizer got the number of the 2nd bucket.

#### Date (DATE, TIMESTAMP and subtypes)

Oracle also uses the original data to create histogram:

SQL代码
1. select substrb(dump(val, 16, 0, 32), 1, 120) ep, cnt
2.   from (select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring */
3.          "B" val, count(*) cnt
4.           from "DEMO"."HTC3" t
5.          where "B" is not null
6.          group by "B")
7.  order by val

To store the data as endpoint value, it needs convert the date to number. The rule is converting the value to days, both date & time parts. To convert time parts to day, it will convert it to seconds then divide 86400(24*60*60), which is the number of seconds of one day. After that, the number should also be ROUND to the 15th position before be stored.

SQL代码
1. HELLODBA.COM>select endpoint_number,
2.   2         endpoint_value
3.   3    from dba_histograms
4.   4   where owner = 'DEMO'
5.   5     and table_name = 'HTC3'
6.   6     and column_name = 'B';
7.
8.                          ENDPOINT_NUMBER                           ENDPOINT_VALUE
9. ---------------------------------------- ----------------------------------------
10.                                        1                         2455538.00001157
11.                                        2                         2455538.00002315
12.                                        3                         2455538.00003472
13.                                        4                          2455538.0000463