19 May 2014

7533 views

4 0

Parallel Execution Skew – 12c Hybrid Hash Distribution With Skew Detection

Since Parallel Execution Skew is a common problem that can severely limit the scalability of Parallel Execution (see my previous article for a demonstration), Oracle has introduced a new feature in 12c that can address the problem automatically. That’s great, but unfortunately – at least in the initial 12c release – it only applies to a limited number of scenarios, so we still need to look at ways how to address the problem manually. Nevertheless it’s a great feature and a good starting point, maybe in future releases Parallel Execution Skew will be addressed automatically by the database in all the most common scenarios. Until that happens, we’ll have to resort to workarounds.

For the feature to be triggered automatically Oracle needs a histogram on the join expression. Oracle will use the histogram information to check for popular values – if the histogram shows that such popular values exist a special code path in the optimizer will be triggered that runs a recursive query on the table during optimization to generate a list of such popular values. These values will actually be hard-coded into the cursor information, and the HYBRID HASH distribution of “the other” row source at run time will check the data to distribute against this list of popular values in the cursor. If there is a match the value will be selectively distributed via BROADCAST to all receiving Parallel Execution Servers (all other values will still use the HASH distribution), and for the table with the skewed, popular values data that matches won’t be distributed via HASH, but simply in a random / round-robin fashion.

This way, any data having the popular values will be evenly distributed among the receiving Parallel Execution Servers, and due to the BROADCAST of the popular values in the other row source the join is still guaranteed to produce correct results, as the join will find a match no matter which worker process gets the values assigned.

Note: If you’re looking for more details how this feature works, see this accompanying post on my blog.

Let’s see that using our previous example. For the feature to be triggered we need to add a histogram on the corresponding column:

1	exec dbms_stats.gather_table_stats(null, 't_2', method_opt=>'for all columns size 1 for columns fk_id_skew size 254', no_invalidate=>false)

and re-execute the parallel variant of our query:

select count(t_2_filler) from (

select /*+ monitor

leading(t_1 t_2)

use_hash(t_2)

no_swap_join_inputs(t_2)

pq_distribute(t_2 hash hash) */

t_1.id as t_1_id

, t_1.filler as t_1_filler

, t_2.id as t_2_id

, t_2.filler as t_2_filler

from t_1

, t_2

where

t_2.fk_id_skew = t_1.id

and regexp_replace(t_2.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') >= regexp_replace(t_1.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')

);

On my test system the query is back to 15 seconds, so that’s a great improvement compared to the 45 seconds it took without the new feature.

We can see the good distribution confirmed by looking at the output of V$PQ_TQSTAT again. By looking at the computed sub-totals we can also see confirmed that Oracle has actually selectively distributed the popular value for the “other” row source T_1, as indicated by 2000003 rows being produced / consumed rather than the actual 2000000 rows that the row source consists of – so the popular value “1” was broadcast to all four worker processes:

break on dfo_number nodup on tq_id nodup on server_type skip 1 nodup on instance nodup

compute sum label Total of num_rows on server_type

select

/*dfo_number

, */tq_id

, cast(server_type as varchar2(10)) as server_type

, instance

, cast(process as varchar2(8)) as process

, num_rows

, round(ratio_to_report(num_rows) over (partition by dfo_number, tq_id, server_type) * 100) as "%"

, cast(rpad('#', round(num_rows * 10 / nullif(max(num_rows) over (partition by dfo_number, tq_id, server_type), 0)), '#') as varchar2(10)) as graph

, round(bytes / 1024 / 1024) as mb

, round(bytes / nullif(num_rows, 0)) as "bytes/row"

from

v$pq_tqstat

order by

dfo_number

, tq_id

, server_type desc

, instance

, process

;

TQ_ID SERVER_TYP INSTANCE PROCESS NUM_ROWS % GRAPH MB bytes/row

---------- ---------- ---------- -------- ---------- ---------- ---------- ---------- ----------

0 Producer 1 P004 493866 25 ########## 51 109

P005 489961 24 ########## 51 109

P006 508088 25 ########## 53 109

P007 508088 25 ########## 53 109

********** ********** ----------

Total 2000003

Consumer 1 P000 499615 25 ########## 52 109

P001 500735 25 ########## 52 109

P002 499522 25 ########## 52 109

P003 500131 25 ########## 52 109

********** ********** ----------

Total 2000003

1 Producer 1 P004 482877 24 ######### 49 107

P005 497862 25 ########## 51 106

P006 509934 25 ########## 52 107

P007 509327 25 ########## 52 106

********** ********** ----------

Total 2000000

Consumer 1 P000 500094 25 ########## 51 107

P001 499886 25 ########## 51 107

P002 499484 25 ########## 51 107

P003 500536 25 ########## 51 107

********** ********** ----------

Total 2000000

2 Producer 1 P000 1 25 ########## 0 36

P001 1 25 ########## 0 36

P002 1 25 ########## 0 36

P003 1 25 ########## 0 36

********** ********** ----------

Total 4

Consumer 1 QC 4 100 ########## 0 36

********** ********** ----------

Total 4

Here is the corresponding execution plan from 12c:

----------------------------------------------------------------------------------

----------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | | | |

| 1 | SORT AGGREGATE | | | | |

| 2 | PX COORDINATOR | | | | |

| 3 | PX SEND QC (RANDOM) | :TQ10002 | Q1,02 | P->S | QC (RAND) |

| 4 | SORT AGGREGATE | | Q1,02 | PCWP | |

|* 5 | HASH JOIN | | Q1,02 | PCWP | |

| 6 | PX RECEIVE | | Q1,02 | PCWP | |

| 7 | PX SEND HYBRID HASH | :TQ10000 | Q1,00 | P->P | HYBRID HASH|

| 8 | STATISTICS COLLECTOR | | Q1,00 | PCWC | |

| 9 | PX BLOCK ITERATOR | | Q1,00 | PCWC | |

|* 10 | TABLE ACCESS FULL | T_1 | Q1,00 | PCWP | |

| 11 | PX RECEIVE | | Q1,02 | PCWP | |

| 12 | PX SEND HYBRID HASH (SKEW)| :TQ10001 | Q1,01 | P->P | HYBRID HASH|

| 13 | PX BLOCK ITERATOR | | Q1,01 | PCWC | |

|* 14 | TABLE ACCESS FULL | T_2 | Q1,01 | PCWP | |

----------------------------------------------------------------------------------

The new feature is indicated in operation ID = 12 by the keyword “(SKEW)” in the “PX SEND HYBRID HASH (SKEW)” operation. By the way, I believe that the “PQ Distrib” column is supposed to show the same “HYBRID HASH (SKEW)” distribution but is simply truncated in the output. According to my description above in my opinion the other PX SEND HYBRID HASH operation (the distribution of T_1) should also be flagged with “(SKEW)” because it doesn’t do a simple hash distribution, but does a mixed BROADCAST/HASH distribution.

But as pointed out before, the SKEW aware distribution feature at present does only work in certain scenarios. For example, if you make the join an outer join, or if you replace the full table scan on T_2 with a simple unmerged view, or make T_2 the build row source rather than then probe row source, then the feature isn’t triggered:

Outer Join example:

select count(t_1_filler) from (

select /*+ monitor

leading(t_1 t_2)

use_hash(t_2)

no_swap_join_inputs(t_2)

pq_distribute(t_2 hash hash) */

t_1.id as t_1_id

, t_1.filler as t_1_filler

, t_2.id as t_2_id

, t_2.filler as t_2_filler

from t_1

, t_2

where

t_2.fk_id_skew (+) = t_1.id

and regexp_replace(t_2.filler (+), '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') >= regexp_replace(t_1.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')

);

T_2 as build row source:

select count(t_2_filler) from (

select /*+ monitor

leading(t_1 t_2)

use_hash(t_2)

swap_join_inputs(t_2)

pq_distribute(t_2 hash hash) */

t_1.id as t_1_id

, t_1.filler as t_1_filler

, t_2.id as t_2_id

, t_2.filler as t_2_filler

from t_1

, t_2

where

t_2.fk_id_skew = t_1.id

and regexp_replace(t_2.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') >= regexp_replace(t_1.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')

);

T_2 as simple view row source:

select count(t_2_filler) from (

select /*+ monitor

leading(t_1 t_2)

use_hash(t_2)

no_swap_join_inputs(t_2)

pq_distribute(t_2 hash hash) */

t_1.id as t_1_id

, t_1.filler as t_1_filler

, t_2.id as t_2_id

, t_2.filler as t_2_filler

from t_1

, (select /*+ no_merge */ * from t_2) t_2

where

t_2.fk_id_skew = t_1.id

and regexp_replace(t_2.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') >= regexp_replace(t_1.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')

);

All these queries are back to 45 seconds on my test system and the execution plan doesn’t show any sight of the SKEW detection feature.

It’s also worth pointing out that the special SKEW aware distribution is only supported with hash joins, so, for example, using a parallel MERGE JOIN instead in the previous example won’t trigger the SKEW aware distribution either.

Furthermore if you happen to have joins to more than two tables then the join order dictates whether the feature might get used or not.

For example this three table join makes use of the feature:

select count(t_2_filler) from (

select /*+ monitor

leading(a b c)

use_hash(b c)

no_swap_join_inputs(b)

no_swap_join_inputs(c)

pq_distribute(b hash hash)

pq_distribute(c hash hash)

a.id as t_1_id

, a.filler as t_1_filler

, c.id as t_2_id

, c.filler as t_2_filler

from t_1 a

, t_1 b

, t_2 c

where

c.fk_id_skew = b.id

and regexp_replace(c.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') >= regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')

and a.id = b.id

and regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') >= regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')

);

But this one here uses slightly different joins and a different join order that joins T_2 first to another instance of T_1, so it doesn’t make use of the feature and therefore suffers again from skew:

select count(t_2_filler) from (

select /*+ monitor

leading(a c b)

use_hash(b c)

swap_join_inputs(b)

no_swap_join_inputs(c)

pq_distribute(b hash hash)

pq_distribute(c hash hash)

a.id as t_1_id

, a.filler as t_1_filler

, c.id as t_2_id

, c.filler as t_2_filler

from t_1 a

, t_1 b

, t_2 c

where

c.fk_id_skew = b.id

and regexp_replace(c.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') >= regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')

and a.id = c.id

and regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') >= regexp_replace(c.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')

);

So what can we do to address Parallel Execution Skew manually, in cases where in 12c the new skew aware distribution doesn’t work yet or in versions prior 12c? We’ll have a look at that in the next part of the series.

Parallel Execution Skew – 12c Hybrid Hash Distribution With Skew Detection

Subscribe for more articles

Rate this article

Randolf Geist

Related articles

Parallel Execution Skew – Skew Caused By Outer Joins

A Guide to INSERT, UPDATE, and DELETE Statements in Oracle

Tags