While looking at the results of a query against the dynamic performance view v$active_session_history during a recent troubleshooting assignment I noticed a result that was so unusual that I decided I needed to get some information from the v$ash_info view over an interval of about 3 minutes – and it didn’t matter which three minutes in particular, any three (consecutive) minutes would do.

The drawback, of course, is that v$ash_info is one of those views that shows a summary since instance startup, so I needed to take a couple of snapshots and find changes in values. Just to make the task a little more awkward, I didn’t have access to the system so I needed to send in a script for someone else to run.

Luckily this was 12c, and there’s a feature in 12c that makes it easy to write code that takes timed deltas of any particular dataset, so as a first step I wrote a little bit of code that took 20 consecutive snapshots with a 10 second gap between snapshots; this is what it looked like:

rem
rem     Script:         ash_info.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2017
rem

define m_snapshot_time  = 10
define m_snapshot_count = 20

set linesize 180
set pagesize 60
set trimspool on

set arraysize 60

prompt  ===========================================
prompt  Reporting v$ash_info &m_snapshot_count times
prompt  with a snapshot every &m_snapshot_time seconds
prompt  Please wait
prompt  ===========================================

with
        function wait_row_n (
                i_secs          number,
                i_return        number default -1
        ) return number
        is
        begin
                dbms_lock.sleep(i_secs);
                return i_return;
        end wait_row_n;
select
        /*+
                leading(looper aio)
                use_nl(aio)
        */
        wait_row_n(&m_snapshot_time, looper.ct) sleeper,
        aio.*
from
        (
        select
                rownum  ct
        from
                dual
        connect by
                level <= &m_snapshot_count
        )       looper,
        v$ash_info aio
/

There are a few details worth noting in this query but the important 12c one is the feature of creating a PL/SQL function in the “WITH” clause of an SQL statement. Historically, if you wanted to create and use a PL/SQL function from SQL, the function had to be a standalone object (possibly packaged) that you called from SQL, and one of the drawbacks to this was that you could run into privilege problems. In the example above my function calls the dbms_lock package; I have received all my privileges from a role (called test_role) that has been granted execute on the dbms_lock package, and the SQL above works (note: the role test_role has also been granted select on v_$ash_info). Look what happens, though, if I try to create a standalone PL/SQL function to replace the “factored subquery” function:

SQL> create or replace procedure my_sleep(n in number) as
  2  begin
  3          sys.dbms_lock.sleep(n);
  4  end;
  5  /

Warning: Procedure created with compilation errors.

SQL> show errors
Errors for PROCEDURE MY_SLEEP:

LINE/COL ERROR
-------- -----------------------------------------------------------------
3/9      PL/SQL: Statement ignored
3/9      PLS-00201: identifier 'SYS.DBMS_LOCK' must be declared

I can’t use the dbms_lock package in a permanent object in my schema if I’ve only got the privilege to execute it granted through a role. If I connect as SYS and grant execute on dbms_lock to my schema (called test_user) I can reconnect as test_user and the create procedure call will succeed.
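Just to make the two situations concrete, here’s a sketch (run as SYS) of the grants involved – the role and schema names match the ones mentioned above, but the exact statements are an illustration rather than a copy of the original setup:

rem     Role-based privileges: sufficient for the "with function" version
create role test_role;
grant create session, create procedure to test_role;
grant execute on dbms_lock   to test_role;
grant select  on v_$ash_info to test_role;
grant test_role to test_user;

rem     Direct (schema-based) grant: needed before "create procedure my_sleep" compiles
grant execute on dbms_lock to test_user;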

Inevitably there are pros and cons to embedding PL/SQL functions within SQL so don’t take this note as a suggestion that you should always inline any PL/SQL functions you want to use – it just happens to be very convenient in this particular case.

The other details worth a little mention in this script are the hinting and the use of the slash character to execute the statement.

The basic reason the script does what I want is that the dynamic performance views don’t produce read-consistent results. My query against dual produces N rows, one every M seconds; if v$ash_info were a normal Oracle table (or view) it would produce the same result for every one of those N rows if I used it as the second table in a nested loop join, but because it’s a dynamic performance view what I actually get is a different result each time I visit it. In the absence of the two hints Oracle could take a path that scanned v$ash_info once and did a merge join with the N rows I get from dual, giving me N copies of the same values. Both the leading() hint and the use_nl() hint are critical to getting the results I want.

I’ve used the “/” to execute the query – a semi-colon doesn’t work. Technically a semi-colon is a separator, but historically SQL*Plus has used it as an alternative “run” command; once you introduce a “with function” clause – which will contain semi-colons for the PL/SQL component of the text – the alternative meaning of the semi-colon ceases to be used and you have to use the “/” to run the statement. If you change the “/” to a semi-colon you’ll find that SQL*Plus treats the next bit of text as if it were part of the statement (and you’ll probably see error “ORA-00933: SQL command not properly ended” as Oracle tries to interpret it).

Interestingly, but unsurprisingly, blank lines in the SQL are automatically acceptable once you’ve introduced the “with function” clause, irrespective of the setting of the SQL*Plus environment variable SQLBLANKLINES.
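Without a “with function” clause you would normally have to allow blank lines explicitly; as a reminder, the standard SQL*Plus command to do that is:

set sqlblanklines on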

Deltas

My first cut of the script simply lists all the absolute values from the view. As a starting point that’s probably good enough for what I wanted. (In fact, if I had had open access to the system I wouldn’t have bothered with the script, I’d simply have run “select * from v$ash_info” a few times while spooling to a file – after setting a suitable linesize and a few column formats, of course.) From the viewpoint of future convenience, though, it would be nice to have a script that got the data but showed the change from snapshot to snapshot. I had to increase the complexity of the SQL a little by using one of the simplest analytic functions to achieve this and then, after a little stress testing, I decided to add a few enhancements to the first version of the code, so currently my code looks like this:

column delta_seconds format 999.999
column snap_time     format a19
column sample_time   format a19


alter session set statistics_level = all;

with
        function wait_row(
                i_secs          number,
                i_counter       number default -1
        ) return timestamp
        is
                m_time  timestamp;
        begin
                m_time := systimestamp;
                dbms_lock.sleep(i_secs);
                return m_time;
        end wait_row;

base_data as (
        select
                /*+
                        leading(looper aio)
                        use_nl(aio)
                        materialize
                */
                wait_row(&m_snapshot_time, looper.ct)   timer,
                aio.*
        from
                (
                select
                        rownum  ct
                from
                        dual
                connect by
                        level <= &m_snapshot_count
                )       looper,
                v$ash_info aio
)
select
        to_char(timer,'hh24:mi:ssxff')                                  snap_time,
        extract(second from
                        (timer - lag(timer,1,timer) over (order by timer))
        )                                                               delta_seconds,
        latest_sample_id                                                sample_id,
        latest_sample_id -
                lag(latest_sample_id,1,latest_sample_id) over (order by timer)  delta,
        to_char(latest_sample_time,'hh24:mi:ssxff')                     sample_time,
        extract(second from
                        (
                        latest_sample_time -
                                lag(latest_sample_time,1,latest_sample_time) over (order by timer)
                        )
        )                                                               delta_seconds,
        sample_count -
                lag(sample_count,1,0) over (order by timer)             samples,
        dropped_sample_count -
                lag(dropped_sample_count,1,0) over (order by timer)     dropped_samples,
        sampled_bytes -
                lag(sampled_bytes,1,0) over (order by timer)            sampled_bytes,
        sampler_elapsed_time -
                lag(sampler_elapsed_time,1,0) over (order by timer)     sampling_musec
from
        base_data
order by
        1
/


Again there are a couple of little details to note. For demonstration purposes I’ve left a few blank lines in the middle of the code just to demonstrate the principle. More importantly I’ve got a “with function” and a “with subquery” in the same statement – you’ll notice that the function bit ends with a semi-colon (though technically that should be “is separated from the next block by” rather than “ends with”). If I wanted another function I could start its definition immediately before the definition of the base_data factored subquery. It’s a quirky little detail but multiple functions are separated by semi-colons while multiple subqueries are separated by commas.
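Here’s a purely illustrative skeleton of the pattern (the function and subquery names are invented):

with
        function f1(i_n number) return number is
        begin
                return i_n;
        end f1;
        function f2(i_n number) return number is
        begin
                return i_n * 2;
        end f2;                                 -- semi-colon separates the functions
v1 as (
        select f1(rownum) n from dual
),                                              -- comma separates the subqueries
v2 as (
        select f2(n) n2 from v1
)
select * from v2
/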

You can also see a less well-known detail of the lag() analytic function: I’ve set the second (i.e. offset) parameter explicitly to 1 (the default) in each case, and the third parameter to the value I want Oracle to supply if the offset takes Oracle outside the current window. I’ve supplied zero for most of the columns, which conveniently means that the first row of output shows the absolute starting values, but I’ve repeated the original column name in three places so that the output for the first row shows zeros for those deltas.
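A tiny stand-alone example (against invented data, not v$ash_info) may make the effect of that third parameter clearer:

select
        val,
        val - lag(val, 1, 0)   over (order by val)  delta_from_zero,  -- first row shows the absolute value
        val - lag(val, 1, val) over (order by val)  delta_from_self   -- first row shows zero
from    (
        select rownum * 10 val from dual connect by level <= 3
        )
/

With data values 10, 20, 30 the first row reports 10 for delta_from_zero but 0 for delta_from_self; every subsequent row reports 10 in both columns.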

An important enhancement to the output from the query is the time at which each query against v$ash_info was actually executed – you’ll see why I did this when we review the results. I’ve changed my function to return systimestamp as at the moment the function was called, then included both the time of the call and the change in time since the previous call in my output. After including this detail I had to materialize the subquery because Oracle seemed to be copying the function call into the select list rather than calling the function once per row and re-using the result – which may be a change in behaviour that could affect some other pieces of SQL that I’ve written in the past.
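Since the script sets statistics_level to all, one quick way to check that the materialize hint has actually taken effect is to pull the last execution plan and look for a TEMP TABLE TRANSFORMATION operation. This is a suggested cross-check rather than part of the original script:

select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

(Run this immediately after the main query, with serveroutput off, so that display_cursor() picks up the right statement.)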

Here’s an example of the output:

===========================================
Reporting v$ash_info 20 times
with a snapshot every 10 seconds
Please wait
===========================================

SNAP_TIME           DELTA_SECONDS  SAMPLE_ID      DELTA SAMPLE_TIME         DELTA_SECONDS    SAMPLES DROPPED_SAMPLES SAMPLED_BYTES SAMPLING_MUSEC
------------------- ------------- ---------- ---------- ------------------- ------------- ---------- --------------- ------------- --------------
12:14:58.556150000           .000    4745434          0 12:14:57.825000000           .000      16432               3       7016168        4963931
12:15:08.593213000         10.037    4745444         10 12:15:08.215000000         10.390         10               0         14024           1327
12:15:18.619130000         10.026    4745454         10 12:15:18.605000000         10.390         10               0         15712           1393
12:15:28.634358000         10.015    4745463          9 12:15:27.885000000          9.280          9               0         14368           1552
12:15:38.722004000         10.088    4745473         10 12:15:38.255000000         10.370         10               0         20240           1408
12:15:48.725934000         10.004    4745483         10 12:15:48.676000000         10.421         10               0         20280           1185
12:15:58.768118000         10.042    4745492          9 12:15:58.036000000          9.360          9               0         18280            653
12:16:08.774180000         10.006    4745502         10 12:16:08.436000000         10.400         10               0         20432            959
12:16:18.776470000         10.002    4745511          9 12:16:17.796000000          9.360          9               0         15208            954
12:16:28.824398000         10.048    4745521         10 12:16:28.196000000         10.400         10               0         20832           1229
12:16:38.828178000         10.004    4745531         10 12:16:38.596000000         10.400         10               0         16816           1021
12:16:48.833060000         10.005    4745540          9 12:16:47.956000000          9.360          9               0         16112           1179
12:16:58.880889000         10.048    4745550         10 12:16:58.346000000         10.390         10               0         17888           1274
12:17:08.884090000         10.003    4745560         10 12:17:08.736000000         10.390         10               0         17904           1085
12:17:18.888500000         10.004    4745569          9 12:17:18.096000000          9.360          9               0         16128           1207
12:17:28.938052000         10.050    4745579         10 12:17:28.496000000         10.400         10               0         16224           1294
12:17:38.940824000         10.003    4745588          9 12:17:37.896000000          9.400          9               0         10544            830
12:17:49.009051000         10.068    4745598         10 12:17:48.292000000         10.396         10               0         10584           1136
12:17:59.030062000         10.021    4745608         10 12:17:58.612000000         10.320         10               0          8888           1286
12:18:09.039469000         10.009    4745618         10 12:18:08.822000000         10.210         10               0          6976           1065

20 rows selected.

I was trying to overload the CPUs by running up a set of CPU-intensive queries against a small instance on a VM when I generated this report. This introduced two problems for the report. The first is that when the CPU is overloaded (by about 400% in my case) my session might find it hard to get onto the CPU in a timely fashion, so the time between lines of the report might be more than the 10 seconds I requested. The second problem is the same problem – but for mmnl, the process that constructs the ASH data; it’s supposed to take a snapshot once per second but it, too, might not get onto the CPU when it needs to.

Looking at the output above we can see that my session had less of a problem than mmnl. My snapshot time is drifting by a few hundredths of a second on each sleep; on the other hand we know that mmnl should be taking samples once per second so with our 10 second snapshot we should see the sample count going up by 10 and the sample time going up by 10 seconds on each snapshot – but when we see 10 samples we see the sample time going up by around 10.4 seconds, and there’s a “self-correcting” element in place that means we sometimes see only 9 samples taken in our 10 second window. 

On the plus side, even though the system is under some pressure due to excess CPU utilisation we can see that we have no “dropped samples”. The column name is a little misleading – it’s not a count of samples that have been dropped, it’s a count of the active sessions that were omitted from a sample because the sampler couldn’t get a self-consistent image of those sessions. Ironically, when the CPU is overloaded it’s probably the sessions that can’t be captured that are the most interesting and important sessions.

Conclusion

I’ve demonstrated a method of using a new 12c feature to report changes over time from a dynamic performance view that holds only information accumulated since instance startup. A key point of the new feature is that it allows a user with role-based privileges to do things that were previously only possible for a user with directly granted (schema-based) privileges.

The method is generic, but the specific example I chose also allowed me to make a couple of points about performance monitoring, effects of CPU overload, and the need for care in cross-checking figures.