[中文]
Author:
fuyuncat
Source:
www.HelloDBA.com
Date:
2010-03-18 03:17:15
sort arithmetic
During sorting, data will first be input into a private memory area named sort area, and be sorted in memory. If the data to be sorted is too large to be handle once, the sorted data will be directly written (no caching) into temporary tablespace as a data set. Once all of data be written into disk, oracle will merge those data sets recursively, till to all data be merged in one data set.
Initial Runs
The process that sort data in-memory is called initial run. Up to 80~90% of the sort_area_size may be used for read buffers during an initial run, and the rest is used for write buffers. If we know how many data (considering output records, average length and block size, it could be estimated) need be sorted, we can estimate the number of initial runs by below formula.
Initial Runs = CEIL(SORT_DATA_BLOKCS/ROUND(0.8*SORT_AREA_SIZE/BLOCK_SIZE))
Merges
2 or more data sets could be merged once. The number data sets be merged once is called merge width. Merging is also be processed in sort area. To merge the data set, need first directly read them from disk. Just like MBRC, mutliple blocks could be read in one io, which is controlled by SMRC (sort_multiblock_read_count, who become a hidden parameter from 9i). However, the value of this parameter is not the actual SMRC, it actually is to limit the MAX SMRC. To merge data sets, at least 2 data set must be selected. And correspond to asynchronous io (disk_asynch_io=TRUE), 2 read buffers required. So, the actual SMRC could be calculated as below.
SMRC = MIN(ROUND(SORT_AREA_SIZE/(2*2*BLOCK_SIZE))-1, _sort_multiblock_read_count)
Then, the merge width could be calculated base on the SMRC,
merge width = ROUND(SORT_AREA_SIZE/(2*SMRC*BLOCK_SIZE))-1
It will keep merging, untill all data are merged in one data set. The process will be recursive. For the last data sets, if the number is less than merge width, they will be put into next round. We have below function to calculate the number of merges.
Note: All of the merges except the last one are named Intermediate runs.
SQL代码
- HELLODBA.COM>create or replace function sort_merges(init_runs number, width number)
- 2 return number
- 3 as
- 4 i number:=init_runs;
- 5 n number:=0;
- 6 begin
- 7 while i>width loop
- 8 n := n+floor(i/width);
- 9 i := floor(i/width)+ mod(i,width);
- 10 end loop;
- 11 if i > 0 then
- 12 n := n+1;
- 13 end if;
- 14 return n;
- 15 end;
- 16 /
- Function created.
Look into an example. Here we have 65536b sort area, 721 blocks estimated to be sorted with 8k block size. We can get the sorting statistics data by below query.
SQL代码
- HELLODBA.COM>select init_runs,SMRC,merge_width,sort_merges(init_runs,merge_width) merges from
- 2 (select init_runs,
- 3 SMRC,
- 4 round(sort_area_size / (2 * SMRC * block_size), 0) - 1 as merge_width
- 5 from (select sort_blocks, sort_area_size, block_size, ceil(sort_blocks / round(0.8 * sort_area_size / block_size, 0)) init_runs,
- 6 round(sort_area_size / (2 * 2 * block_size), 0) - 1 SMRC
- 7 from (select 721 as sort_blocks, 65536 as sort_area_size, 8192 as block_size from dual)));
- INIT_RUNS SMRC MERGE_WIDTH MERGES
- ---------- ---------- ----------- ----------
- 121 1 3 60
- 1 row selected.
new sort algorithm
From 10gR2, new sort algorithm is introduced, which controled by hidden parameter "_new_sort", default is true. Here is its improvement.
*In-memory sort.
Less momory required for those small data set sorting in memory.
*Merge Sort
Will save keys for those initial data sets, and will choose the data sets to be merged by comparing the keys. And you will see a new statistics entry from the trace file:
Comparisons while searching for key in-memory 120
The temporary blocks will be more possible to be re-used when mergeing, save disk blocks.
Trace
10031, 10032, 10033 event traces will help us to understand the sorting. Below is the trace contents of previous example.
SQL代码
- Recording run at 401ca9 for 6 blocks
- Recording run at 401cb0 for 6 blocks
- Recording run at 401cb6 for 6 blocks
- Recording run at 401d3c for 6 blocks
- Recording run at 401d42 for 6 blocks
- Recording run at 401d48 for 6 blocks
- ...
- ---- Sort Parameters ------------------------------
- sort_area_size 65536
- sort_area_retained_size 65536
- sort_multiblock_read_count 1
- max intermediate merge width 3
- Merging run at 401d1a for 1 blocks
- Merging run at 401c3a for 6 blocks
- Total number of blocks to read: 7 blocks
- Recording run at 401d1b for 7 blocks
- Merging run at 401c40 for 6 blocks
- Merging run at 401c46 for 6 blocks
- Merging run at 401c5c for 6 blocks
- Total number of blocks to read: 18 blocks
- Recording run at 401d22 for 17 blocks
- ...
- ---- Sort Statistics ------------------------------
- Initial runs 121
- Intermediate runs 59
- Number of merges 60
- Input records 47582
- Output records 47582
- Disk blocks 1st pass 721
- Total disk blocks used 896
- Total number of comparisons performed 651813
- Comparisons performed by in-memory sort 320660
- Comparisons performed during merge 331033
- Comparisons while searching for key in-memory 120
- Temp segments allocated 1
- Extents allocated 56
- Uses version 2 sort
- Uses asynchronous IO
- ---- Run Directory Statistics ----
- Run directory block reads (buffer cache) 240
- Block pins (for run directory) 1
- Block repins (for run directory) 239
- ---- Direct Write Statistics -----
- Write slot size 8192
- Write slots used during in-memory sort 2
- Number of direct writes 2884
- Num blocks written (with direct write) 2884
- Block pins (for sort records) 2884
- Waits for async writes 1731
- ---- Direct Read Statistics ------
- Size of read slots for merge phase 8192
- Number of read slots for merge phase 6
- Size of read slots for output 8192
- Number of read slots for output 8
- Number of direct sync reads 1498
- Number of blocks read synchronously 1498
- Number of direct async reads 1386
- Number of blocks read asynchronously 1386
- Waits for async reads 347
- ---- End of Sort Statistics -----------------------
--- Fuyuncat ---
Previous:TIPS: Build OUTLINE agilely | Next:Bloom Filter |
Return To this Category |