Basics of the Cost Based Optimizer – Part 1

This series on Oracle’s Cost Based Optimizer is aimed at the less experienced DBAs and developers to help them understand what the optimizer is trying to achieve, how it arrives at an execution plan, why it makes mistakes, and (perhaps most importantly) how to recognize the source of those mistakes and so address the resulting problems in an appropriate fashion.

I will try to avoid going into extreme technical detail though I will outline a few of the slightly more technical issues and mention a few URLs that go into greater depth on particular topics.

To get the best out of this series you will need to have some experience with reading execution plans, especially when we look at the “trouble-shooting” side of optimization.

Overview

Ignoring all the bits and pieces that relate to readability (of the code as well as the results) a basic SQL statement takes the form:
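
Stripped to a skeleton (treating the items in braces as placeholders rather than anything you could run as-is) that form is:

        select
                {columns, expressions and aggregates}
        from
                {one or more tables or views}
        where
                {predicates that filter and join the data}
        group by
                {columns to aggregate by}
        order by
                {columns for the final ordering}
        ;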

The task of the optimizer is to work out the fastest possible strategy for acquiring the raw data you need and massaging that data into the result set you want to see. To do this it relies on a numerical model that contains information about your data (the objects and their statistics), the machine performance (the system statistics), and the available machine resources (the database/instance configuration).

The model allows the optimizer to estimate how much data it will have to examine, how much work it will take to visit that data, how much of that data it actually needs, how much work it will take to compare sets of data for matches, how much data will survive or be generated by that comparison, and how much work it will take to sort and/or aggregate data.

The trouble with the optimizer is that the static model can never be good enough – even though Oracle Corp. is constantly enhancing the code to work around the inherent deficiencies of any static model – moreover, the model doesn’t allow for the side effects of how the data is being used (which might actually be a good thing unless the way you use the data is absolutely, boringly, predictable). This means that it’s relatively easy to find cases where the optimizer says: “If I use strategy A it will take less than 1 second, if I use strategy B it will take 2 minutes”, while you can see that strategy A takes 10 minutes whereas, if you force Oracle to take strategy B, you get a result in 5 seconds. When this happens there will be a reason that is usually easy to identify; most frequently that reason will be that the optimizer doesn’t really understand what your data looks like, or doesn’t realize that there’s a mechanical factor involved that allows some activity to complete more quickly than the optimizer had estimated.

Example:

Let’s use a very simple query to provide an informal description of how the optimizer “thinks”. Assume I have an order-processing system that has been running for the last 3 years, roughly doubling the number of orders placed each year and currently running at about 20,000 orders per week. Most orders are for a single product but a small percentage of the orders include 2 or 3 items (order lines) and a tiny fraction has 4 to 10 items. On average, therefore, each order has 1.2 order lines associated with it where each order line is for a single product.

Here’s some SQL to describe the tables and current indexing:
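
A minimal sketch of the kind of structure I have in mind follows; the index names ord_dt_ord and orl_fk_prd are the ones I refer to later in the article, but the exact column names and datatypes are illustrative guesses rather than a definitive schema:

        create table orders (
                id_ord          number(10,0)    not null,
                customer_id     number(10,0)    not null,
                date_ordered    date            not null,
                constraint ord_pk primary key (id_ord)
        --      , constraint ord_fk_cst foreign key (customer_id) references customers
        );

        create index ord_dt_ord on orders (date_ordered);

        create table order_lines (
                id_ord          number(10,0)    not null,
                line_no         number(4,0)     not null,
                product_id      number(10,0)    not null,
                quantity        number(10,0)    not null,
                value           number(10,2)    not null,
                constraint orl_pk primary key (id_ord, line_no),
                constraint orl_fk_ord foreign key (id_ord) references orders
        --      , constraint orl_fk_prd foreign key (product_id) references products
        );

        create index orl_fk_prd on order_lines (product_id);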

There are a couple of foreign key constraints commented out in this minimalist structure – I haven’t defined a products table or a customers table, which would be the natural referential integrity targets for the order_lines.product_id and orders.customer_id columns.

So here’s a query we might run: “for each day of the last week, report the number of order lines, the total volume, and the total value of all sales of a specific product”.
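
Using the column names assumed in the sketch above, and picking 101234 as the specific product, the query might look something like this (exactly how you define “the last week” is a detail we can ignore for now):

        select
                trunc(ord.date_ordered)         order_date,
                count(*)                        line_count,
                sum(orl.quantity)               total_quantity,
                sum(orl.value)                  total_value
        from
                orders          ord,
                order_lines     orl
        where
                ord.date_ordered >= trunc(sysdate) - 7
        and     orl.id_ord       = ord.id_ord
        and     orl.product_id   = 101234
        group by
                trunc(ord.date_ordered)
        order by
                trunc(ord.date_ordered)
        ;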

How does the optimizer approach this problem?

With our knowledge of the business we know that there are going to be about 20,000 orders in the relevant time period. Averaging across the three years, though, the optimizer will produce an estimate closer to 12,000. (Optional exercise 1: check that the latter statement is a reasonable approximation, then consider how inaccurate it still might be.)

We might know that product 101234 is very popular and appears on 30 order_lines per day; or we may know that it’s a high-cost item which is only ordered a couple of times per week. If it’s very popular and the statistics on the order_lines table included a (sufficiently lucky) histogram on the product_id then the optimizer might have enough information to get a good estimate, otherwise it may simply work out the average number of order_lines per product ordered and use that in its calculations. Let’s say the optimizer comes up with an estimate (based on averages) of 1,000 order lines for product 101234 across the three years.
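
If you want to see what the optimizer actually knows about product_id, the relevant figures are visible in the standard dictionary views; a quick sanity check might look like this:

        select  column_name, num_distinct, density, num_nulls, histogram
        from    user_tab_col_statistics
        where   table_name  = 'ORDER_LINES'
        and     column_name = 'PRODUCT_ID'
        ;

Without a histogram the optimizer’s estimate for a single product is essentially num_rows / num_distinct (or, in the simple case, num_rows * density).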

What’s the next step? According to its basic estimates the optimizer has to compare 12,000 orders with 1,000 order lines, find the matches (orders in the last week for product 101234), and then do a little aggregation. If we follow the optimizer’s arithmetic based on these numbers we can see that it’s going to come up with an estimate of 6 or 7 orders in the last week that include product 101234 – a tiny result set, but we’ll have to acquire and eliminate a lot of data to get to it. (Optional exercise 2: check the description of the data and confirm that 6 or 7 orders per week is a reasonable estimate; then explain why there is a flaw in the logic that produces this estimate even if all the products are equally popular.)

After “how much”, the next question is “where is it”. Understanding how the database and the order-processing system work, we (the people) can assume that the orders for the last week will have gone into the table over the last few days and will be packed into the last few blocks of the table – and the numbers the optimizer has in its model (most significantly the clustering_factor on the ord_dt_ord index) may allow it to recognise that the (incorrectly estimated) 12,000 orders for the last week are packed into just 400 blocks in the table.

Conversely we (the people) know that the order lines for product 101234 will be spread over the entire 3 year trading history, and the numbers the optimizer has in its model (specifically the clustering_factor on the orl_fk_prd index) ought to tell it that every row we’re interested in will probably be in a different block in the table. So the (estimated) 1,000 rows will be scattered across 1,000 blocks of the table – which we, and the optimizer, could recognise as an indication that we will need to do 1,000 random physical reads to get the data.
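
The clustering_factor figures that drive this part of the model are just as easy to inspect (assuming the index names used in the earlier sketch):

        select  table_name, index_name, blevel, leaf_blocks, clustering_factor
        from    user_indexes
        where   table_name in ('ORDERS', 'ORDER_LINES')
        order by
                table_name, index_name
        ;

As a rough guide, a clustering_factor close to the number of blocks in the table means that rows for adjacent index entries tend to share table blocks (the ord_dt_ord pattern), while a clustering_factor close to the number of rows in the table means that virtually every index entry points to a different table block (the orl_fk_prd pattern).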

From a human perspective the optimizer has to use this information to compare three possible strategies:

  1. collect all the “possibly relevant” orders and all the “possibly relevant” order_lines independently, then compare the two sets of data to find matches (there are basically two mechanisms to do this, the hash join and the merge join)
  2. collect all the orders for the last week and, as we acquire the data for each order in turn, find the matching order_lines rows (plural), discarding any data that isn’t about product 101234 (a nested loop join driven by orders)
  3. collect all the order lines for product 101234 and, as we acquire the data for each order line in turn, find the matching orders row (singular), discarding any data that isn’t about the last week (a nested loop driven by order_lines)
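
If you want to experiment with these three strategies against the sketch query, a few hints will usually push the optimizer in the required direction; the following is an illustration rather than a recommendation (the aliases ord and orl are the ones used in the earlier sketch):

        select  /*+ leading(ord orl) use_nl(orl) */
                trunc(ord.date_ordered), count(*), sum(orl.quantity), sum(orl.value)
        from    orders ord, order_lines orl
        where   ord.date_ordered >= trunc(sysdate) - 7
        and     orl.id_ord       = ord.id_ord
        and     orl.product_id   = 101234
        group by
                trunc(ord.date_ordered)
        ;

The hint set shown pushes the optimizer towards option (2); swapping it for /*+ leading(ord orl) use_hash(orl) */ points it at option (1), and /*+ leading(orl ord) use_nl(ord) */ at option (3). Check the resulting execution plan in each case rather than assuming the hints have been applied exactly as intended.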

Option (1) hash/merge join

Consider option (1) from the optimizer’s perspective: it has to decide how much work it will take to read all the order lines for product 101234 and all the orders for the last week. Having decided that each order line is in a different block in the table, it will then consider two strategies: use the product_id index to find and read each block separately, for a total of 1,000 table block reads (plus a handful of index block reads), or use the brute force method of scanning the entire table, reading every block, examining every row, and discarding the rows it doesn’t need.

The choice is dictated by time – let’s assume that the optimizer has decided (through the system statistics) that the typical multiblock read will turn out to be 32 consecutive blocks and take just twice as long as reading a single block; in that case it will decide to do the tablescan if the table size is no more than 16,000 blocks (since 32/2 = 16, and the alternative is 1,000 single block reads). Similarly the optimizer may choose to do a tablescan of the orders table – but if it has decided that the indexed access path will require it to visit 400 blocks then it will only pick the tablescan if the size of the table is less than 16 * 400 = 6,400 blocks. (We will ignore, for the purposes of this exercise, the fact that there may be more options for getting the data from a single table than just the choice between a tablescan and a single indexed access path.)
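
The figures behind that “32 consecutive blocks, twice as long as a single block read” assumption are the system statistics, and (given suitable privileges) you can see the stored values with a query like this:

        select  pname, pval1
        from    sys.aux_stats$
        where   sname = 'SYSSTATS_MAIN'
        and     pname in ('MBRC', 'SREADTIM', 'MREADTIM')
        ;

With mbrc = 32 and mreadtim = 2 * sreadtim, each multiblock read covers 32 blocks for the time cost of two single block reads, which is where the factor of 16 in the comparison above comes from.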

I won’t go into the details of the hash join and merge join calculations, but for small data sets the extra cost of using a bit of memory and a little CPU time to execute the join is small.

Option (2) nested loop from order to order_lines

The optimizer thinks it is going to collect 12,000 orders so whatever it does next to find the matching order lines it thinks it will be doing it 12,000 times. That’s going to produce a very large total cost unless the thing it does next has an extremely low cost each time it happens. Unfortunately this is where the optimizer algorithms start to run into difficulties. The optimizer will calculate the cost of finding the order lines for a single order – starting from scratch the most efficient method it can come up with is to use the index that starts with the order_id and, according to its arithmetic, this will give it 1.2 order_lines which would require at least one random read from the order_lines table. This means the total cost of operating the nested loop join this way round will exceed 12,000 random single-block reads.
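
Loosely speaking (and ignoring various refinements the optimizer can apply) the arithmetic behind this is the standard nested loop formula:

\[
\text{cost(NL)} \;\approx\; \text{cost(acquire outer rowsource)} \;+\; \text{card(outer)} \times \text{cost(one probe of the inner rowsource)}
\]

With card(outer) estimated at 12,000 and each probe costed at a small number of block reads (at least one random read from the order_lines table), the 12,000 multiplier dominates everything else in the plan.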

Of course, we (the people) know that the (predicted) 12,000 * 1.2 order lines will be very well packed in the last few blocks of the table, and so we might expect that once Oracle has read a few of the order lines it will have loaded all the necessary blocks into the buffer cache and won’t need to do any more random reads. This plan might be the fastest executing plan, but the optimizer might think it’s going to take much more time than we would expect.

In fact in a typical order-processing system it’s likely that most of the orders and order lines for the last few days will have been in the buffer cache before we started the query, so we may not have to do any physical reads at all. One of the commonest sources of “bad choice of plan” from the optimizer is that it doesn’t have any idea of how you use your data, so it doesn’t understand when it might get a lot of benefit from caching.

Option (3) nested loop from order_lines to orders

The argument is similar to a mixture of options (1) and (2). First the optimizer has to decide whether it’s going to do a tablescan or use an index-driven method to collect 1,000 order lines; then it has to work out how to find the matching order for each order line. In this case it only has to execute “find the order” 1,000 times, so if it can work out a reasonably cheap way of finding an order the total cost of this path MAY be less than the cost of option (1). Unfortunately this could be a case where – even restricting ourselves to the nested loop options – the path with the lowest cost might be the path that takes the longest time.

The best way for Oracle to locate an order from an order line is to use the index on the order_id, and the optimizer will calculate that finding an order will require it to read one table block. In this case we’ve started with 1,000 order lines scattered randomly across time, so the associated orders will also be scattered randomly across time, and it’s highly likely that we will end up reading something like 1,000 + 1,000 = 2,000 table blocks to complete the whole query.

When I considered option (2) I said we would get a lot of benefit from the time-dependent caching of order_lines, on top of which we might only need to read a few leaf blocks from the indexes. When we start from the order_lines table we won’t get any time-dependent table caching benefit and we will probably have to read far more index leaf blocks from the orders index.

From the optimizer’s perspective the cost of finding the order lines for a single order and the cost of finding the order for a single order line are virtually identical – but going one way the optimizer thinks it would have to do 12,000 searches, while going the other way it would have to do only 1,000.

From our perspective we know that Oracle will get a massive caching benefit from the thing it expects to do 12,000 times and virtually no caching benefit from the thing it expects to do 1,000 times. In other words the optimizer will give option (2) a cost that is roughly 12 times the cost of option (3), while we expect option (3) to complete roughly 2.5 times slower than option (2) – comparing the 1,000 extra reads we predict for 1,000 randomly scattered orders with the 400 or so reads (worst case scenario) we might have to do to get 20,000 order lines into the buffer cache.

In the next instalment I’ll create a data set for this model, so that I can show you the plans we might get with a simple problem like this, and the strategies we can use to help Oracle identify the most appropriate execution plan.

Comments on exercises:

Exercise 1: (Oracle’s estimate of 12,000): We currently run at 20,000 orders per week after doubling every year over three years – so 5,000 per week in the first year, 10,000 per week in the second, and 20,000 per week in the third – for an average of about 35,000/3 orders per week over the period. (We should recognize that this still isn’t a realistic model, but it is a little better than thinking in terms of a simple flat average.)
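
In round numbers that average is:

\[
\frac{5{,}000 + 10{,}000 + 20{,}000}{3} \;=\; \frac{35{,}000}{3} \;\approx\; 11{,}700 \text{ orders per week}
\]

which is where the “closer to 12,000” figure in the main text comes from.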

Exercise 2: (estimate of 6 or 7 orders of product 101234 per week). The optimizer has no idea that sales have doubled every year (though a suitably partitioned table would allow it to capture that information) so a better model would suggest 143 orders for product 101234 in the first year, 286 orders in year 2, and 571 orders in year 3, which means more than 11 (actually close to 15 if we model smooth weekly growth rather than a very simple stepped annual growth) in the most recent week. Of course, in the real world a typical order processing system would also show seasonal variation for some products, and there would be a continuous churn of new products becoming available and old products being discontinued.
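
In round numbers, assuming exactly 1,000 order lines for the product split 1 : 2 : 4 across the three years:

\[
1{,}000 \times \tfrac{1}{7} \approx 143, \qquad
1{,}000 \times \tfrac{2}{7} \approx 286, \qquad
1{,}000 \times \tfrac{4}{7} \approx 571, \qquad
\frac{571}{52} \approx 11 \text{ per week}
\]

compared with the flat average of 1,000 / 156 ≈ 6.4 per week that lies behind the optimizer’s estimate of 6 or 7.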
