In part 8 we looked at a very simple execution plan involving a hash join; but that example was too simple to give us the full flavour of the arithmetic involved in Oracle’s predictions because every operation executed just once. We need to see some plans where each execution of a parent operation could require multiple executions of its child operations; and that’s what we’re going to do in parts 9 and 10.

Getting started

Thanks to the constant evolution of cost based query transformation (CBQT) it’s surprisingly hard to find an example of a simple execution plan that actually displays the information that it “really ought to”, so I’ll have to show you a couple of examples where the way the numbers are displayed is inappropriate, and then explain the history behind the anomalies. In this article we’ll look at a simple nested loop join (using 11.2.0.4):

select
	t1.id, t2.id
from
	t2, t1
where
	t2.n1 = 15
and	t1.n1 = t2.n2
;

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |   225 |  4500 |    46   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                 |       |   225 |  4500 |    46   (0)| 00:00:01 |
|   2 |   NESTED LOOPS                |       |   225 |  4500 |    46   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T2    |    15 |   180 |    16   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | T2_I1 |    15 |       |     1   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | T1_I1 |    15 |       |     1   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | T1    |    15 |   120 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."N1"=15)
   5 - access("T1"."N1"="T2"."N2")

As you can see, this plan has taken the new ‘nlj_batching’ strategy, which shows up as the doubling of the nested loop operation. The order of operation is 4, 3, 5, 2, 6, 1, 0: we do an index range scan of t2_i1 to identify rows from t2; for each row we acquire from t2 we do an index range scan of t1_i1; then for each rowid we fetch from t1_i1 we visit t1 to get the row. Essentially the predicted row counts are correct – my data set will supply 15 rows from t2, and each row in t2 will join to 15 rows in t1 for a total result set of 225 rows. But what do the figures actually tell us?

I’m going to ignore the Bytes column – we learned all we need to know about what it’s trying to say, and how badly it says it, in the last installment. I’m also going to ignore the Time column since we know that it is derived from the Cost column multiplied by the sreadtim from sys.aux_stats$.

Lines 4 and 3 tell us we pick up 15 rowids then pick up 15 rows; the costs are 1 and 16 respectively – one leaf block physical read, then (remembering that we accumulate up the plan from child to parent) 15 table block reads to visit the table, plus the one leaf block read, for a total cost of 16.

Line 5 says we do an index range scan of t1_i1 at a cost of 1 to pick up 15 rows (and yes, each time we do it for my data set we do visit just one leaf block and find 15 rowids). Those figures, of course, are the prediction for one execution of line 5 – but when the query runs the optimizer isn’t expecting line 5 to be executed just once.

Hitting Problems

We (and the optimizer) expect to do that index range scan 15 times in total, and that’s why the nested loop join at line 2 tells us (correctly) that it will generate 225 rows; but where does the cost of 46 come from? The answer is simple: it’s a bug, more or less, that reflects history. The cost at line 2 ought to say 31 because (accumulating from the children up – and remembering to allow for multiple child executions) it’s: 16 (one execution of line 3) + 1 (one execution of line 5) * 15 (because we expect 15 executions of line 5, thanks to the 15 rows predicted in line 3).
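The accumulation rule is simple enough to sketch in a few lines of arithmetic (a sketch only – the variable names are mine, not Oracle’s):

```python
# Nested loop costing: cost of the join = cost of the outer (first child)
# plus cost per execution of the inner (second child) multiplied by the
# number of executions, i.e. the rows predicted for the outer child.

cost_t2_access = 16   # plan line 3: 1 leaf block + 15 table block reads
cost_t1_index  = 1    # plan line 5: one execution of the t1_i1 range scan
rows_from_t2   = 15   # plan line 3 predicts 15 rows, hence 15 executions

cost_inner_nl = cost_t2_access + cost_t1_index * rows_from_t2
print(cost_inner_nl)  # 31
```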

Suspend the question “why 46 instead of 31?” for a moment and go on to line 6 – which is completely wrong. For each of the 225 items generated in line 2 we will execute line 6 to find one row in table t1 (not the fifteen rows predicted); and the appropriate cost of a single execution of the table access by rowid is one (not the two predicted).

There’s a reason why the figures at this point are inappropriate: they were designed to reflect the execution strategy that the optimizer used to produce in Oracle 8.1 and earlier, which would have looked like this (using utlxpls.sql on a very old version of Oracle):

--------------------------------------------------------------------------------
| Operation                 |  Name    |  Rows | Bytes|  Cost  | Pstart| Pstop |
--------------------------------------------------------------------------------
| SELECT STATEMENT          |          |   225 |    4K|     35 |       |       |
|  NESTED LOOPS             |          |   225 |    4K|     35 |       |       |
|   TABLE ACCESS BY INDEX RO|T2        |    15 |  180 |      5 |       |       |
|    INDEX RANGE SCAN       |T2_I1     |    15 |      |      1 |       |       |
|   TABLE ACCESS BY INDEX RO|T1        |     3K|   23K|      2 |       |       |
|    INDEX RANGE SCAN       |T1_I1     |     3K|      |      1 |       |       |
--------------------------------------------------------------------------------

It’s unfortunate that in this case the Rows values for the access to t1 and its index are completely wrong, but you can appreciate that this plan is telling us that for each row we return from t2 we do an index range scan of t1_i1 to get (15, rather than the 3,000) rows from t1, and that the cost of the index range scan is 1 (one leaf block) while the cost of the table access (including its child index access) is 2 (the data in t1 happens to be very well clustered). In this case it’s clear, then, that the cost of the nested loop is: 5 (t2 access) + 2 (t1 access) * 15 (predicted executions of the t1 access) = 35.
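That child-to-parent formula checks out against the old plan’s numbers; and, as a sketch, plugging the 11g plan’s per-execution figures into the same historical formula also suggests where the original plan’s cost of 46 comes from:

```python
# 8.1-style nested loop cost: outer access + inner access * outer rows.
cost_8i = 5 + 2 * 15      # t2 access (5) + t1 access (2) * 15 rows
print(cost_8i)            # 35

# Plug the 11g plan's figures into the same formula - t2 access at 16,
# the t1 table-plus-index access still reported at 2 per execution -
# and the result matches the 46 reported at the top of the newer plan.
cost_11g = 16 + 2 * 15
print(cost_11g)           # 46
```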

So, line 6 in our original plan is reporting (with bug fixes) 8i numbers for the operations, even though the shape of the execution plan has changed and the operational mechanism is different. Technically we could argue that line 6 should look like this:

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   6 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |     8 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Each time we execute it, we will acquire one row, and we will visit one block. Of course, we will be executing this line 225 times, which means that applying the rule “accumulate children up to their parent” will produce a ridiculous cost – we’re caught in a trap between the optimizer doing its traditional arithmetic to determine the resource requirement and the execution plan showing the strategy the execution engine will actually take. Arguably we have a choice between the plan that Oracle shows us, and the following alternative which is, in some respects, a little more truthful:

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |   225 |  4500 |    46   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                 |       |   225 |  4500 |    46   (0)| 00:00:01 |
|   2 |   NESTED LOOPS                |       |   225 |  2700 |    31   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T2    |    15 |   180 |    16   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | T2_I1 |    15 |       |     1   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | T1_I1 |    15 |       |     1   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |     8 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

In this version of the plan I’ve adjusted the cost (and bytes) in line 2, to reflect the execution of the index range on t1_i1; and I’ve corrected the rows and cost (and bytes) in line 6 to reflect the execution of the table access.

What I haven’t done is change the cost in line 1, even though you could argue that the cost should be 256 (calculated as 31 + 225 * 1). The trouble is that the cost of 1 in line 6 isn’t truthful. Because the data in t1 is very well clustered we know we’re not going to read a block on every execution of line 6; we may read a block for the first row of a set of 15, but we can be confident that the next 14 executions will visit the same block in memory – on average the cost of acquiring one row from t1 by rowid should be 1/15. (Note: in fact the optimizer does its arithmetic to several decimal places, but rounds the figures for reporting in the execution plan; this is why you can sometimes see figures that suggest things like 2 * 3 = 5 – internally it might have been 1.8 * 2.6 = 4.68.)
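The rounding effect in that note is easy to demonstrate, using the made-up internal values from the note. As a consistency check, the sketch also costs 225 executions of line 6 at the “truthful” 1/15 of a block each, which takes the alternative plan’s 31 back up to the reported 46:

```python
# The optimizer works with fractional costs internally but rounds the
# figures it reports, so the displayed numbers needn't multiply cleanly.
a, b = 1.8, 2.6
print(round(a), round(b), round(a * b))   # 2 3 5 - looks like "2 * 3 = 5"

# Consistency check: 225 executions of line 6 at 1/15 of a block each,
# added to the adjusted inner-join cost of 31, gives the reported 46.
print(31 + 225 / 15)                      # 46.0
```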

As this example shows there’s a presentation problem associated with execution plans. The optimizer tries to work out a resource cost and a mechanical strategy for getting the data you want, but the way the resource cost is calculated isn’t always visually compatible with the shape of the execution plan; so when you try to create some simple rules in an attempt to understand the numbers you will find some anomalies. When the basic strategy of “accumulate the children to the parent” (including the bit about multiplying up correctly) doesn’t make sense a little history and a little flexibility may give you a clue about how to re-interpret the numbers.

 
