Cost Based Optimisation - 2

Add to Favourites
Post to:
Comments
Presentation Transcript Presentation Transcript

Optimisation and Efficiency - (pt 2) : Optimisation and Efficiency - (pt 2) Jonathan Lewis www.jlcomp.demon.co.uk jonathanlewis.wordpress.com

Who am I ? : Cost Based Oracle Basics 1b - 2 Jonathan Lewis ©2001-2010 Who am I ? Independent Consultant. 25+ years in IT 21+ using Oracle Strategy, Design, Review Briefings, Seminars Trouble-shooting Jonathanlewis.wordpress.com www.jlcomp.demon.co.uk Member of the Oak Table Network. Oracle ACE Director Oracle author of the year 2006 Select Editor’s choice 2007

Reprise - 1 : Cost Based Oracle Basics 1b - 3 Jonathan Lewis ©2001-2010 Reprise - 1 create table t1 as select trunc((rownum - 1)/15) n1, trunc((rownum - 1)/15) n2, rpad(rownum - 1,180) v1 from all_objects where rownum <= 3000; create table t2 as select mod(rownum - 1,200) n1, mod(rownum - 1,200) n2, rpad(rownum -1 ,180) v1 from all_objects where rownum <= 3000; In the first session we created these two sets of data - with identical content but with a different ordering, then created an index on the n1 column.

Reprise - 2 : Cost Based Oracle Basics 1b - 4 Jonathan Lewis ©2001-2010 Reprise - 2 If we drew a picture of the data distribution of the n1 column, we could see that it was well clustered in one table, and very widely scattered in the other

Reprise - 3 : Cost Based Oracle Basics 1b - 5 Jonathan Lewis ©2001-2010 Reprise - 3 select v1 from t2 where n1 = 45; . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 15 | 2775 | 14 | |* 1 | TABLE ACCESS FULL| T2 | 15 | 2775 | 14 | select v1 from t1 where n1 = 45; Because of this difference shows up in the clustering_factor of the index on (n1), the same query gets different execution plans from the two tables, db_file_multiblock_read_count = 8 system statistics disabled . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 15 | 2775 | 2 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 15 | 2775 | 2 | |* 2 | INDEX RANGE SCAN | T1_I1 | 15 | | 1 |

Reprise - 4 : Cost Based Oracle Basics 1b - 6 Jonathan Lewis ©2001-2010 Reprise - 4 T2 by index one index block, 15 data blocks = 16 T2 or T1 by scan ceiling(82 blocks / 6.588 ) + 1 = 14 (this uses an adjusted db_file_multiblock_read_count) T1 by index one index block, one data block = 2 “io-costing" counts assumed physical read requests As an approximation for this case we see that the old costing method counts I/O requests to work out costs - ignoring any measure of time or caching

Multiblock Read Count : Cost Based Oracle Basics 1b - 7 Jonathan Lewis ©2001-2010 Multiblock Read Count select v1 from t2 where n1 = 45; -- tablescan cost was 14 alter session set db_file_multiblock_read_count = 4; . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 15 | 2775 | 16 | | 1 | TABLE ACCESS BY INDEX ROWID| T2 | 15 | 2775 | 16 | |* 2 | INDEX RANGE SCAN | T2_I1 | 15 | | 1 | (The tablescan cost would be 21) alter session set db_file_multiblock_read_count = 16; . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 15 | 2775 | 9 | |* 1 | TABLE ACCESS FULL| T2 | 15 | 2775 | 9 | The cost of a tablescan uses an 'adjusted' db_file_multiblock_read_count. And in 9i, _tablescan_cost_plus_one is set to true, biasing towards indexes.

Indexed-access adjustment : Cost Based Oracle Basics 1b - 8 Jonathan Lewis ©2001-2010 Indexed-access adjustment select * from t2 where n1 = 45; -- index access cost was 16 alter session set OPTIMIZER_INDEX_COST_ADJ = 50; alter session set OPTIMIZER_INDEX_COST_ADJ = 25; In older versions of Oracle we might fix one of Oracle's assumptions - tell it that index-driven single block reads are cheaper than multi-block reads. . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 15 | 2775 | 8 | | 1 | TABLE ACCESS BY INDEX ROWID| T2 | 15 | 2775 | 8 | |* 2 | INDEX RANGE SCAN | T2_I1 | 15 | | 1 | . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 15 | 2775 | 4 | | 1 | TABLE ACCESS BY INDEX ROWID| T2 | 15 | 2775 | 4 | |* 2 | INDEX RANGE SCAN | T2_I1 | 15 | | 1 |

OICA Threat : Cost Based Oracle Basics 1b - 9 Jonathan Lewis ©2001-2010 OICA Threat Execution Plan (Unhinted) . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 1 | 14 | 2 | |* 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 14 | 2 | |* 2 | INDEX RANGE SCAN | T1_GOOD | 4 | | 1 | Execution Plan (Hinted) . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 1 | 14 | 5 | |* 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 14 | 5 | |* 2 | INDEX RANGE SCAN | T1_BAD | 4 | | 1 | Execution Plan (Unhinted, but optimizer_index_cost_adj = 20) . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 1 | 14 | 1 | |* 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 14 | 1 | |* 2 | INDEX RANGE SCAN | T1_BAD | 4 | | 1 | See "Oh I see a problem." on www.jlcomp.demon.co.uk/ind_misc. OICA scales costs down. Due to rounding errors this can pick the wrong index.

System Statistics – setting : Cost Based Oracle Basics 1b - 10 Jonathan Lewis ©2001-2010 System Statistics – setting begin dbms_stats.set_system_stats('MBRC',6); dbms_stats.set_system_stats('MREADTIM',24); dbms_stats.set_system_stats('SREADTIM',10); dbms_stats.set_system_stats('CPUSPEED',700); end; begin dbms_stats.set_system_stats('MAXTHR', 1600000); dbms_stats.set_system_stats('SLAVETHR',400000); end; Two examples of setting system stats. The dbms_stats package allows us to set or get any stats. And we can be more intelligent about it than Oracle. Since Oracle 9i there has been a better option. This was optional in 9i but enabled by default from 10g onwards

System Statistics – using : Cost Based Oracle Basics 1b - 11 Jonathan Lewis ©2001-2010 System Statistics – using select * from t1; (t1 is 82 blocks) Plan with IO costing . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 3000 | 547K| 14 | | 1 | TABLE ACCESS FULL| T1 | 3000 | 547K| 14 | 14 = ceil(82/6.588) + 1 Plan with CPU costing . | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | | 0 | SELECT STATEMENT | | 3000 | 547K| 35 (0)| 00:00:01 | | 1 | TABLE ACCESS FULL| T1 | 3000 | 547K| 35 (0)| 00:00:01 | If we set the statistics from the previous page, we see how tablescans (in particular) change cost in 9.2 . CPU costing tends to favour indexes. 35 = ceil(ceil(82/6) * (24/10)) + 1 + bit of CPU = 33 + 1 + a bit of CPU = 34 + bit of CPU -- use of ceil() may be different

Index Caching (NL and INLIST) : Cost Based Oracle Basics 1b - 12 Jonathan Lewis ©2001-2010 Index Caching (NL and INLIST) select t2.n1, t1.n2 from t2,t1 where t2.n2 = 45 -- 15 rows from t2, and t1.n1 = t2.n1; -- 15 for each from t1 Execution plan (without CPU costing) . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 225 | 3600 | 29 | |* 1 | HASH JOIN | | 225 | 3600 | 29 | |* 2 | TABLE ACCESS FULL| T2 | 15 | 120 | 14 | | 3 | TABLE ACCESS FULL| T1 | 3000 | 24000 | 14 | 29 = 1 + 14 + 14 We also have a way of changing the cost of indexed access when it appears in nested loop joins or in-list iterators. Use our test tables in a simple join.

Index Caching (2) : Cost Based Oracle Basics 1b - 13 Jonathan Lewis ©2001-2010 Index Caching (2) select /*+ ordered use_nl(t1) index(t1(n1)) */ t2.n1, t1.n2 from t2,t1 where t2.n2 = 45 and t1.n1 = t2.n1; “Available” nested loop (without CPU costing) . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 225 | 3600 | 44 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 15 | 120 | 2 | | 2 | NESTED LOOPS | | 225 | 3600 | 44 | |* 3 | TABLE ACCESS FULL | T2 | 15 | 120 | 14 | |* 4 | INDEX RANGE SCAN | T1_I1 | 15 | | 1 | With hinting, we can get a nested loop – but the cost is higher than the hash join cost. We can draw a picture of this plan (which is different from 8i)

Indexed Nested Loop : Cost Based Oracle Basics 1b - 14 Jonathan Lewis ©2001-2010 Indexed Nested Loop For each row in the first table, we do an indexed access (usually) into the second table 2. (Then 9i introduced “prefetching” in blocking queries).

Index Caching (3) : Cost Based Oracle Basics 1b - 15 Jonathan Lewis ©2001-2010 Index Caching (3) 8i NL plan with range scan SELECT STATEMENT Optimizer=ALL_ROWS NESTED LOOPS TABLE ACCESS (FULL) OF 'T2' TABLE ACCESS (BY INDEX ROWID) OF 'T1' INDEX (RANGE SCAN) OF 'T_I1' (NON-UNIQUE) The plan for a nested loop join with index range scan on the second table has changed over time. The original was an “intuitive” statement of our picture. 9i NL plan with range scan SELECT STATEMENT Optimizer=ALL_ROWS TABLE ACCESS (BY INDEX ROWID) OF 'T1' NESTED LOOPS TABLE ACCESS (FULL) OF 'T2' INDEX (RANGE SCAN) OF 'T_I1' (NON-UNIQUE)

Index Caching (4) : Cost Based Oracle Basics 1b - 16 Jonathan Lewis ©2001-2010 Index Caching (4) alter session set OPTIMIZER_INDEX_CACHING = 0; -- default . | 0 | SELECT STATEMENT | | 225 | 3600 | 44 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 15 | 120 | 2 | | 2 | NESTED LOOPS | | 225 | 3600 | 44 | |* 3 | TABLE ACCESS FULL | T2 | 15 | 120 | 14 | |* 4 | INDEX RANGE SCAN | T1_I1 | 15 | | 1 | T2 - cost = 14 Estimated rows = 15 For each row from T2 we access T1 by complete key value T1 - Cost per access = 2 Cost for 15 accesses = 15 x 2 = 30 Total cost of query = cost of T2 + total cost of T1 = 14 + 30 = 44 The shape of the plan changed, but the numbers were not re-arranged to make the costs consistent. The arithmetic doesn’t allow for caching effects..

Index Caching (5) : Cost Based Oracle Basics 1b - 17 Jonathan Lewis ©2001-2010 Index Caching (5) alter session set OPTIMIZER_INDEX_CACHING = 100; -- extreme . | 0 | SELECT STATEMENT | | 225 | 3600 | 29 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 15 | 120 | 1 | | 2 | NESTED LOOPS | | 225 | 3600 | 29 | |* 3 | TABLE ACCESS FULL | T2 | 15 | 120 | 14 | |* 4 | INDEX RANGE SCAN | T1_I1 | 15 | | | T2 - cost = 14 Estimated rows = 15 For each row from T2 we access T1 by complete key value T1 - Cost per access = 1 Cost for 15 accesses = 15 x 1 = 15 Total cost of query = cost of T2 + total cost of T1 = 14 + 15 = 29 Arguably with system statistics it is reasonable to supply caching indicators through OICA and OIC, but I prefer to avoid this as they have global effects

/*+ nlj_batching */ – 11g : Cost Based Oracle Basics 1b - 18 Jonathan Lewis ©2001-2010 /*+ nlj_batching */ – 11g select /*+ ordered use_nl(t1) index(t1(n1)) */ t2.n1, t1.n2 from t2,t1 where t2.n2 = 45 and t1.n1 = t2.n1; . | Id | Operation | Name | Rows | | 0 | SELECT STATEMENT | | 225 | | 1 | NESTED LOOPS | | | | 2 | NESTED LOOPS | | 225 | |* 3 | TABLE ACCESS FULL | T2 | 15 | |* 4 | INDEX RANGE SCAN | T1_I1 | 15 | | 5 | TABLE ACCESS BY INDEX ROWID| T1 | 15 | In 11g, Oracle then has a further plan which it calls ‘nlj batching’. Line 2 gets t1 rowids and t2 values then, for each t1 rowid, line 1 gets the t1 values.

What about … : Cost Based Oracle Basics 1b - 19 Jonathan Lewis ©2001-2010 What about … the blevel (branch blocks)? multi-column indexes ? range-based predicates scans ? unique indexes ? bitmap indexes ? Index Organized Tables (IOTs) ? ... " ... with overflows ? This was just a visually helpful introduction

General approximation: : Cost Based Oracle Basics 1b - 20 Jonathan Lewis ©2001-2010 General approximation: For: a single table accessed by B-tree index using an index range scan: blevel + ceil(selectivity[ix_sel] * leaf_blocks) + ceil(selectivity[tb_sel] * clustering_factor) I think the general formula for use of a B-tree index was first publicised in a presentation by Wolfgang Breitling (www.centrexcc.com) at IOUG-2002

General approximation - graphic : Cost Based Oracle Basics 1b - 21 Jonathan Lewis ©2001-2010 General approximation - graphic Fraction of Leaf blocks Fraction of "Clustering-Factor walk" The picture shows the formula. We walk down the root and branch blocks, along the leaf blocks, then do part of the walk used for the clustering_factor.

How did I do it ? : Cost Based Oracle Basics 1b - 22 Jonathan Lewis ©2001-2010 How did I do it ? (if blevel = 1 then 0 else blevel) + ceil(selectivity[ix_sel] * leaf_blocks) + ceil(selectivity[tb_sel] * clustering_factor) For equality on single column indexes: avg_leaf_blocks_per_key @ sel * leaf_blocks avg_data_blocks_per_key @ sel * clustering_factor Cost = avg_leaf_blocks_per_key + avg_data_blocks_per_key But I didn't use the formula, so why did my suggestion work ? My example was a special case - a small, single column index, with equality predicate.

Bitmap Index differences (a) : Cost Based Oracle Basics 1b - 23 Jonathan Lewis ©2001-2010 Bitmap Index differences (a) create bitmap index t_i1 on t1(n1); create bitmap index t_i2 on t2(n1); -- dbms_stats.gather_table_stats(...) -- For bitmap indexes the clustering_factor copies the number of entries in the index - for large indexes, some key values will have multiple entries. For example: bitmap indexes are different. Change our original example to use bitmap indexes and the collected statistics are identical for both indexes TAB NUM_ROWS DIST_CNT D_BLOCKS L_BLOCKS CL_FAC T1 200 200 1 1 200 T2 200 200 1 1 200

Bitmap Index differences (b) : Cost Based Oracle Basics 1b - 24 Jonathan Lewis ©2001-2010 Bitmap Index differences (b) select * from t2 where n1 = 45; . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 15 | 3315 | 5 | | 1 | TABLE ACCESS BY INDEX ROWID | T2 | 15 | 3315 | 5 | | 2 | BITMAP CONVERSION TO ROWIDS| | | | | |* 3 | BITMAP INDEX SINGLE VALUE | T_B2 | | | | select * from t1 where n1 = 45; . | Id | Operation | Name | Rows | Bytes | Cost | | 0 | SELECT STATEMENT | | 15 | 3315 | 5 | | 1 | TABLE ACCESS BY INDEX ROWID | T1 | 15 | 3315 | 5 | | 2 | BITMAP CONVERSION TO ROWIDS| | | | | |* 3 | BITMAP INDEX SINGLE VALUE | T_B1 | | | | Since the statistics are the same, the costs are the same, so the execution path is identical. (This is not a sensible example - we aim to combine bitmaps ).

Bitmap assumptions: : Cost Based Oracle Basics 1b - 25 Jonathan Lewis ©2001-2010 Bitmap assumptions: 80% of the data is packed. (Thanks, K.Gopalakrishnan) 20% of the data is randomly scattered. Special fudge factor of 1.1 for the two index figures Approximate Cost (for very simple cases) @ 0.8 * blocks in the table / distinct_keys + 0.2 * calculated cardinality + 1.1 * index block usage Cost = ceil(0.8 * 82/200) + ceil(0.2 * 15) + 1.1 @ 5 This is close for simple equalities. The 80/20 rule appeared around 8.1.7 - but can be reset to its earlier action of 100% packed by setting event 10170.

Conclusion : Cost Based Oracle Basics 1b - 26 Jonathan Lewis ©2001-2010 Conclusion The goal: How much data and where is it. Cost is time Time was largely a count of I/O requests System statistics try to correct for time Quality of indexes relates to table scatter Try to leave parameters at their defaults There are many “special cases”

Want to learn?

Sign up and browse through relevant courses.

Name:
Your Email:
Password:
Country:
Contact no:


Area code Number
Subjects you are interested in:
Word verification: (Enter the text as in image)


Sign Up Already a member? Sign In
I agree to WizIQ's User Agreement & Privacy Policy
AIOUG Speaker
All India Oracle Users Group
User
37 Members Recommend
73 Followers

Your Facebook Friends on WizIQ

Give live classes, create & sell online courses

Try it free Plans & Pricing

Connect