So far in this series we’ve talked about interpreting the shape of an execution plan and understanding the meaning of the predictions that the optimizer has made about cost and cardinality. It’s finally time to see how Oracle gives us execution plans that show us how well the optimizer’s estimates match the actual work done as the query ran.

There are three main strategies that we could use to view this information – enabling “rowsource execution statistics” then using a call to dbms_xplan.display_cursor() to display the stats; enabling sql_trace (which also enables rowsource execution statistics) and reviewing the stats in the tkprof summaries or even the trace files themselves; or, for those who have licensed the diagnostic and performance packs, enabling SQL Monitoring and viewing the results (combined with some ASH (v$active_session_history) information through one of the calls to dbms_sqltune or through the Enterprise Manager SQL Monitoring screen. In this article we will look at the dbms_xplan.display_cursor() option.

Rowsource Execution Statistics

As a statement executes, Oracle is able to accumulate information about the amount of work done in each operation in the plan, including the number of times each operation has been called, the number of buffers visited, whether the visits were in consistent or current mode, the number of physical reads and (logical) writes, the total time spent in each operation and various bits of information about workarea usage. This information is visible in the dynamic performance view called v$sql_plan_statistics_all, which, in 11.2.0.4, has the following description:

 1:	ADDRESS                                   RAW(8)
 2:	HASH_VALUE                                NUMBER
 3:	SQL_ID                                    VARCHAR2(13)
 4:	PLAN_HASH_VALUE                           NUMBER
 5:	CHILD_ADDRESS                             RAW(8)
 6:	CHILD_NUMBER                              NUMBER
 7:	TIMESTAMP                                 DATE
 8:	OPERATION                                 VARCHAR2(30)
 9:	OPTIONS                                   VARCHAR2(30)
10:	OBJECT_NODE                               VARCHAR2(40)
11:	OBJECT#                                   NUMBER
12:	OBJECT_OWNER                              VARCHAR2(30)
13:	OBJECT_NAME                               VARCHAR2(30)
14:	OBJECT_ALIAS                              VARCHAR2(65)
15:	OBJECT_TYPE                               VARCHAR2(20)
16:	OPTIMIZER                                 VARCHAR2(20)
17:	ID                                        NUMBER
18:	PARENT_ID                                 NUMBER
19:	DEPTH                                     NUMBER
20:	POSITION                                  NUMBER
21:	SEARCH_COLUMNS                            NUMBER
22:	COST                                      NUMBER
23:	CARDINALITY                               NUMBER
24:	BYTES                                     NUMBER
25:	OTHER_TAG                                 VARCHAR2(35)
26:	PARTITION_START                           VARCHAR2(64)
27:	PARTITION_STOP                            VARCHAR2(64)
28:	PARTITION_ID                              NUMBER
29:	OTHER                                     VARCHAR2(4000)
30:	DISTRIBUTION                              VARCHAR2(20)
31:	CPU_COST                                  NUMBER
32:	IO_COST                                   NUMBER
33:	TEMP_SPACE                                NUMBER
34:	ACCESS_PREDICATES                         VARCHAR2(4000)
35:	FILTER_PREDICATES                         VARCHAR2(4000)
36:	PROJECTION                                VARCHAR2(4000)
37:	TIME                                      NUMBER
38:	QBLOCK_NAME                               VARCHAR2(30)
39:	REMARKS                                   VARCHAR2(4000)
40:	OTHER_XML                                 CLOB

41:	EXECUTIONS                                NUMBER
42:	LAST_STARTS                               NUMBER
43:	STARTS                                    NUMBER
44:	LAST_OUTPUT_ROWS                          NUMBER
45:	OUTPUT_ROWS                               NUMBER
46:	LAST_CR_BUFFER_GETS                       NUMBER
47:	CR_BUFFER_GETS                            NUMBER
48:	LAST_CU_BUFFER_GETS                       NUMBER
49:	CU_BUFFER_GETS                            NUMBER
50:	LAST_DISK_READS                           NUMBER
51:	DISK_READS                                NUMBER
52:	LAST_DISK_WRITES                          NUMBER
53:	DISK_WRITES                               NUMBER
54:	LAST_ELAPSED_TIME                         NUMBER
55:	ELAPSED_TIME                              NUMBER

56:	POLICY                                    VARCHAR2(10)
57:	ESTIMATED_OPTIMAL_SIZE                    NUMBER
58:	ESTIMATED_ONEPASS_SIZE                    NUMBER
59:	LAST_MEMORY_USED                          NUMBER
60:	LAST_EXECUTION                            VARCHAR2(10)
61:	LAST_DEGREE                               NUMBER
62:	TOTAL_EXECUTIONS                          NUMBER
63:	OPTIMAL_EXECUTIONS                        NUMBER
64:	ONEPASS_EXECUTIONS                        NUMBER
65:	MULTIPASSES_EXECUTIONS                    NUMBER
66:	ACTIVE_TIME                               NUMBER
67:	MAX_TEMPSEG_SIZE                          NUMBER
68:	LAST_TEMPSEG_SIZE                         NUMBER

If you check through this list of columns you’ll notice that the first 40 columns are just repeating v$sql_plan – the basic in-memory execution plan information – then lines 41 – 55 are the non-key columns from v$sql_plan_statistics, and lines 56 – 68 are from v$sql_workarea. The appearance of the last set of columns, of course, means that a query against v$sql_plan_statistics_all will report some useful information (the work area information) even when you haven’t enabled rowsource execution statistics.

Workarea

Before we pursue v$sql_plan_statistics_all, it’s worth making a couple of comments about how we can take advantage of the information available in v$sql_workarea because it’s a view that’s often overlooked in troubleshooting. Let’s start with a simple query against the sys schema, followed by a call to dbms_xplan.display_cursor() – when we haven’t enabled rowsource execution statistics:

alter session set statistics_level = typical;

select ct, count(*) from (
	select obj#, count(*) ct from sys.source$ group by obj#
) 
group by ct order by ct

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))

SQL_ID  33vchu53wj57a, child number 0
-------------------------------------
select ct, count(*) from (select obj#, count(*) ct from sys.source$
group by obj#) group by ct order by ct

Plan hash value: 2879348764

---------------------------------------------------------------------------------
| Id  | Operation               | Name      | E-Rows |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |           |        |       |       |          |
|   1 |  SORT GROUP BY          |           |   5513 | 36864 | 36864 |32768  (0)|
|   2 |   VIEW                  |           |   5513 |       |       |          |
|   3 |    HASH GROUP BY        |           |   5513 |    37M|  8100K| 2719K (0)|
|   4 |     INDEX FAST FULL SCAN| I_SOURCE1 |    670K|       |       |          |
---------------------------------------------------------------------------------

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level

In the call to display_cursor() I’ve used the parameter values (null, null, ‘allstats last’). This should report the rowsource execution statistics for the last execution of the last statement I’ve run – but I’ve set statistics_level to “typical” (the default) so there are no execution statistics available other than those related to wokarea (PGA) usage. As a side effect of this the Note section of the output contains a warning that the statistics are not available then tells us about the two (official) methods of enabling them (there is a third, which is to set the hidden parameter _rowsource_execution_statistics to true).

Reminder: if you are using dbms_xplan.display_cursor() to report the last plan of the last statement you executed from SQL*Plus, don’t forget that you need to “set serveroutput off” before you begin, otherwise the last statement will be dbms_output.get_lines() and Oracle will complain that it “cannot fetch plan for SQL_ID: 9babjv8yq8ru3”. (A similar complaint – different SQL_ID – may arise occasionally, and apparently randomly, if the statement you are investigating had to be re-optimised during the test.)

For the “last” execution option the workarea stats we get are:

  • OMem:  The memory Oracle thinks we would need to operate completely in memory.
  • 1Mem:  The memory above which we could still manage to do a one-pass workarea operation
  • Used-Mem:  The actual memory we used on the last execution

If we change the call to dbms_xplan to use ‘allstats’ rather than ‘allstats last’, we get a summary of activity for the child cursor since it was first executed; in my case:

---------------------------------------------------------------------------------
| Id  | Operation               | Name      | E-Rows |  OMem |  1Mem |  O/1/M   |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |           |        |       |       |          |
|   1 |  SORT GROUP BY          |           |   5513 | 36864 | 36864 |     2/0/0|
|   2 |   VIEW                  |           |   5513 |       |       |          |
|   3 |    HASH GROUP BY        |           |   5513 |    37M|  8100K|     2/0/0|
|   4 |     INDEX FAST FULL SCAN| I_SOURCE1 |    670K|       |       |          |
---------------------------------------------------------------------------------

The final column now reports the number of times each operation performed as an optimal (in-memory) execution, how many as a one-pass, and how many went multi-pass. In this case (having run the query twice) I had two optimal executions, and no operations spilling to disc. Unfortunately my version of Oracle seemed to have some bugs in this area showing (for example) a blank entry for the O/1/M column at line 3 when it had done a couple of one-pass executions for the “hash group by”. When this happened, though, the plan did give some clue about the spill to disc because it reported the amount of disc space used in a Used-Tmp column (with the “last” option) or a Max-Tmp column. (In passing, the “temporary space” columns report the number of KB used but, unlike the “memory” columns, don’t include a K to remind you of the fact.)

To take best advantage of the v$sql_workarea view in combination with dbms_xplan.display_cursor() you can query the view to find interesting statement with a query like:

select
        sql_id, child_number, estimated_optimal_size,
        total_executions, optimal_executions, onepass_executions, multipasses_executions,
        max_tempseg_size
from
        v$sql_workarea
where
        max_tempseg_size > 65535
or      onepass_executions > 10
or      multipasses_executions > 0
;

After running this query you can use any reported sql_ids and child_numbers in a call to dbms_xplan.display_cursor() to get the SQL and plan for any statements that might be imposing a large strain on your system resources. Another strategy (if you are prompt enough) would be to pass in the sql_id (with an explicit null for the child number) for any interesting SQL ids that you have picked from a very recent AWR or Statspack report – with a little luck the plan and workarea stats might still be in memory. It’s important to remember, though, that this query will be doing a brute-force scan through your entire library cache, so it’s not a query you should run frequently, especially if you have a large SGA, the overheads on latch and mutex acquisition could be significant. (I probably wouldn’t run it more than four or five times per hour.)

Execution Stats

When I enabled rowsource execution statistics by setting statistics_level to all there were two significant changes to my tests. First the execution time jumped from 0.12 seconds to 3.75 seconds (most of it CPU); secondly the output of dbms_xplan.display_cursor() produced a lot more columns:

------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |           |      1 |        |    563 |00:00:03.73 |    1623 |   1632 |     16 |       |       |          |         |
|   1 |  SORT GROUP BY          |           |      1 |   5513 |    563 |00:00:03.73 |    1623 |   1632 |     16 | 37888 | 37888 |32768  (0)|         |
|   2 |   VIEW                  |           |      1 |   5513 |   5514 |00:00:03.71 |    1623 |   1632 |     16 |       |       |          |         |
|   3 |    HASH GROUP BY        |           |      1 |   5513 |   5514 |00:00:03.69 |    1623 |   1632 |     16 |    36M|  8304K|  421K (0)|    1024 |
|   4 |     INDEX FAST FULL SCAN| I_SOURCE1 |      1 |    670K|    670K|00:00:01.84 |    1623 |   1616 |      0 |       |       |          |         |
------------------------------------------------------------------------------------------------------------------------------------------------------

The extra CPU is platform dependent and relates the time spent on calls to the O/S timing functions; and for this reason you may want to do a little testing for your own platform and may want to switch to an alternative mechanism for enabling the stats. As we saw in the ‘Note’ section of an earlier plan, adding the hint /*+ gather_plan_statistics */ to a query will enable the collection; but when the stats are driven by the hint the timing is done by sampling which, in my case, meant the query took 0.18 seconds (compared to 0.12 or 3.75 seconds). There is a catch, of course; here’s the final output hinted when I set the statistics_level back to typical and hinted the stats collection:

------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |           |      1 |        |    563 |00:00:00.15 |    1623 |     42 |     16 |       |       |          |         |
|   1 |  SORT GROUP BY          |           |      1 |   5513 |    563 |00:00:00.15 |    1623 |     42 |     16 | 37888 | 37888 |32768  (0)|         |
|   2 |   VIEW                  |           |      1 |   5513 |   5514 |00:00:00.21 |    1623 |     42 |     16 |       |       |          |         |
|   3 |    HASH GROUP BY        |           |      1 |   5513 |   5514 |00:00:00.17 |    1623 |     42 |     16 |    36M|  8304K|  431K (0)|    1024 |
|   4 |     INDEX FAST FULL SCAN| I_SOURCE1 |      1 |    670K|    670K|00:00:02.31 |    1623 |     26 |      0 |       |       |          |         |
------------------------------------------------------------------------------------------------------------------------------------------------------

The change in Reads is a side effect of normal data caching, but look at the A-Time column in these two examples – this is supposed to accumulate up the plan from child to parent and something has clearly gone wrong. This plan is very simple – each operation calls the one below it – so we should see a simple accumulation up the plan; but the index fast full scan (the first operation to produce a rowsource) reports 2.31 seconds spent in a query that totaled 0.18 seconds; and the accumulated times go up and down as we work up the plan. If you use the /*+ gather_plan_statistics */ hint to enable rowsource execution statistics you cannot trust the accuracy of the A-Time column.

We jumped the gun a little with the explanation of the A-Time column, but here’s a reference to the rest of the columns relating to execution statistics:

  • Starts:  The number of times this operation actually occurred
  • E-rows:  Estimated rows (per execution of the operation) – i.e. the “Rows” column from a call to display()
  • A-rows:  The accumulated number of rows forwarded by this operation
  • A-time:  The accumulated time spent in this operation – including time spent in its descendents.
  • Buffers:  Accumulated buffer visits made by this operation – including its descendents.
  • Reads:  Accumulated number of blocks read from disc by this operation – including its descendents.
  • Writes:  Accumulated number of blocks written to disc by this operation – including its descendents.

There are two critical points to remember when interpreting a plan with execution stats; the first is understanding what it means to say that Oracle ‘accumulates’ up the plan, the second is the significance of the Starts column.

Interpretation

The sample SQL I’ve supplied for this article isn’t ideal for helping us to appreciate either point – and I’ll be using better examples in the next couple of articles – but we can at least see some signs of the accumulation feature.

One of the better rules of thumb for reading execution plans is the sound-bite ‘first child first’ – a parent operation calls each of its child operations in turn, then allows for recursive descent as each child may call its direct descendants before returning rows to the parent. With this in mind it’s easy to appreciate that the work done ‘by an operation’ consists of the work the operation does in its own right (e.g. sorting incoming data) plus the work done by each of its children in supplying data to it.

The example doesn’t show any parent operations with multiple children, but we can see a simple example of this ‘parent plus children’ algorithm in operations 3 and 4 above. A “hash group by” doesn’t, inherently, use the buffer cache, so line 3 can’t be responsible for the 1,623 buffer gets recorded – but they appeared at operation 4, the child to line 3 that produced the rowsource for line 3 to aggregate.

On the other hand, if a “hash group by” spills to disc it will do some writes to the temporary tablespace and then read them back. Operation 3 reports 16 blocks written and 42 blocks read, but how can there be more reads than writes – the excess of 26 is the 26 reads performed at operation 4 during the fast full scan of index i_source1.

From there on upwards the plan shows no incremental use of resources (the “sort group by” of operation 1 completes in memory so there are no further reads and writes).

The significance of the Starts column isn’t really visible in this plan since each operation starts only once. We will examine the column more closely in the next article but for the moment I’ll just point out that the E-rows (estimated) and A-rows (actual) are a very good match for most of the operations in this example, and we should be happy with that since it shows that the optimizer has (right up until the last operation) made a good prediction about the volume of data processed by our query. But the close match in the two columns is only there because the Starts column holds the value 1: E-rows is the estimated rows per execution of the operation, so when we are trying to assess the accuracy of the optimizer’s predictions we generally need to compare A-Rows with E-rows * Starts and (as we shall see in the next article) we still have to be careful about deciding when that is a useful comparison and when it is meaningless.

Summary

In this article we focused on one of the three basic strategies for acquiring rowsource execution statistics – pausing on the way to point out that one component of the full set (v$sql_workarea) can point us towards interesting SQL before we’ve even started looking in detail at any specific SQL.

We’ve looked in some detail at the way in which we can use the fundamental view v$sql_plan_statistics_all, and highlighted a potential performance threat that you need to check for on your specific platforms.

Finally, we’ve discussed how to interpret the numbers that appear when we report the statistics – though we need to look at a more complex example to appreciate fully the significance of the Starts column when trying to use the rowsource execution statistics as an aid to solving performance problems.

 

Tags: , ,