Execution Plans Part 6: Pushed Subqueries

In part 5 of this series we examined a couple of examples where you had to handle the “first child first” rule with a little care. In part 6 we continue this exploration to look at a general principle – the “pushed subquery” – where the “first child first” could lead you to the wrong conclusion. Access or Filter If

In part 5 of this series we examined a couple of examples where you had to handle the “first child first” rule with a little care. In part 6 we continue this exploration to look at a general principle – the “pushed subquery” – where the “first child first” could lead you to the wrong conclusion.

Access or Filter

If you look at the following two execution plans (from an 11.2.0.4 instance) their basic shapes are very similar, and it would be easy to assume that we should interpret the order of operation in both cases to be basically “bottom to top”.

In fact “first child first” is correct for the first plan which really does turn out to operate “bottom to top”; but “first child first” fails on the second plan – the shape of the plan has been distorted by the way that the optimizer presents a filter subquery that has been “pushed” – i.e. run at the earliest possible moment. It’s hard to tell that this has happened from the body of the plan, you really need to check the predicate section of the plan and maybe even refer back to the original statement to understand what’s happening.

Here, in the same order, are the predicate sections for the two plans:

Take note of the critical difference that appears for line 2 – where the operation is INDEX UNIQUE SCAN in both cases. The first plan reports only an access predicate (which includes a subquery) while the second plan shows both an access predicate (which consists of simple predicates against the index columns) and a filter predicate (which consists of a subquery).

For an index operation, the access predicates tell us how to identify the start and stop values for a “range” scan of the index (even when the range is unique), and the filter predicates tell us what tests we have to apply to each index entry we find between the start and stop points (inclusive). After seeing the difference in predicates we can interpret the plans as follows:

First plan:

We do an index range scan of mm_pk at line 5, visiting table min_max at line 4 to identify some rows, aggregating those rows at line 3 to find the maximum value for id_child; we then use that value at line 2 to help us do an index unique scan of mm_pk, visiting table min_max at line 1 to identify (at most) one row which we pass on as the result. The order of operation is: 5,4,3,2,1,0.

Second plan:

We do an index unique scan of mm_pk at line 2 using the supplied literal predicates then execute the existence test of the subquery by doing an index range scan of mm_pk at line 4 using the id_child value from the first index scan as an input to the second index scan; we follow this with a visit to table min_max at line 3 to see if a suitable row exists; if the existence test is satisfied we use the rowid we found in line 2 to visit table min_max at line 1, returning (at most) one row as the result set. The order of operation is 2,4,3,1,0

It’s probably easier to appreciate what’s going on if you see the original queries – please remember, though, that I wrote them to demonstrate a point with the shortest plan possible, not as a solution to a real problem, so don’t bother trying to work out what they’re supposed to achieve.

With the two queries in front of you it’s much easier to see that the first can do something efficient by running the subquery first and then using the result to drive the main query; while the second query has to start by running the main query, stopping each step of the way to run the subquery existence test.

You’ll notice that I‘ve used a couple of hints in the second query to stop the optimizer from doing any cunning transformations – I’ve used the no_unnest hint to block subquery unnesting (which would have turned the query into a join) then I’ve told the optimizer to run the subquery at the earliest possible opportunity (with the push_subq hint); it’s the latter hint that has affected the shape of the execution plan (i.e. the order of operation). If I change that hint to no_push_subq (to block subquery pushing) this is how the plan changes:

With this change the “first child first” rule works again: the filter operation at line 1 calls line 2 (its first child) to return rows, then calls line 4 (its second child) for each row to decide whether or not to keep the row. We can confirm this interpretation by checking the predicate section where we see that the text of the subquery is now visible as a filter predicate in line 1.

Note: when you pull execution plans from memory – v$sql_plan et. al. – the text of subquery predicates disappears; in this case the predicate would be reported as “filter( IS NOT NULL)” leaving you to imagine what it is that is not supposed to be null.

You can get a little help in checking for pushed subqueries by using the ‘outline’ option as the formatting parameter to the call to dbms_xplan as this will allow you to see if there are any push_subq() hints in the outline/SQL Plan baseline that would be associated with the plan.

This example shows you the potential benefit of pushing subqueries – when we blocked the push Oracle didn’t run the subquery until after it had visited the table and acquired a row that we weren’t going to need; on the other hand when we pushed the subquery Oracle ran it the moment we had the index entry available, thereby avoiding a redundant table visit.

Different shapes

There is one other pattern that’s worth reproducing, for the visual reference if nothing else; it’s the example of a pushed subquery that really does have to wait until after we’ve visited the table. Here’s an example of some SQL that would do this:

Note that the subquery is checking for a match on the status column – which is not part of the index, so the subquery can’t run until after Oracle has picked a row from the table in the main query. Here’s the plan:

As usual, when a filter subquery is pushed, the plan doesn’t show a FILTER operation line. If I changed the hint to no_push_subq the shape of this plan would match the one above with the visible FILTER operation and lines 1 and 2 moved one character to the right. As it is, there’s a strange distortion in the plan which makes it appear (according to “first child first”) as if the index range scan at line 2 is the first child of the table access in line 1, and the table access in line 3 is the second child. In fact, we have to read this plan in the order 2,1,4,3,0 – we do a range scan of mm_pk in line 2, visit table min_max at line 1, and for each row we acquire we execute the subquery by doing a unique scan of mm_pk at line 4 and visiting table min_max at line 3, sending the result row to line 0.

Conclusion

In all these examples where the plan fails to match with “first child first” we’ve been looking at pushed subqueries – and in some cases I’ve had to put an explicit hint into the SQL to create the type of plan I wanted to demonstrate. When you’re trying to interpret an execution plan for a statement containing several subqueries, watch out for the distortion to the “normal” shape that pushed subqueries can introduce. You can check for the threat by looking carefully through the predicate section to see if you can see some subquery text appearing as a filter( ) predicate, bearing in mind that if you’ve pulled the plan from memory (v$sql_plan, dbms_xplan.display_cursor() etc.) then the body of the subquery itself will have disappeared and there will be little to see in the brackets.

One final thought: OEM (or Grid Control, or Cloud Control) and probably any other graphic interface to Oracle will use the “first child first” rule to evaluate the order of operation so they’ll be getting it wrong when subqueries are being pushed. OEM, for example, reported the operation order of my last query as 2, 4, 3, 1, 0 (first child first) rather than 2, 1, 4, 3, 0.