Informatica Tutorials

Big Data Analytics

Partition Pruning Techniques

Basic Partition Pruning Techniques
The optimizer uses a wide variety of predicates for pruning. The three predicate types, equality, range, and IN-list are the most commonly used cases of partition pruning. As an example, consider the following query:
SELECT SUM(amount_sold) day_sales
FROM sales
WHERE time_id = TO_DATE('02-JAN-1998', 'DD-MON-YYYY');
Because there is an equality predicate on the partitioning column of sales, this query will prune down to a single predicate and this will be reflected in the explain plan, as shown in the following:
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows| Bytes | Cost (%CPU)| Time |Pstart| Pstop |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 21 (100) | | | |
| 1 | SORT AGGREGATE | | 1 | 13 | | | | |
| 2 | PARTITION RANGE SINGLE | | 485 | 6305 | 21 (10) | 00:00:01 | 5 | 5 |
| * 3 | TABLE ACCESS FULL | SALES | 485 | 6305 | 21 (10) | 00:00:01 | 5 | 5 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("TIME_ID"=TO_DATE('1998-01-02 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
Similarly, a range or an IN-list predicate on the time_id column and the optimizer
would be used to prune to a set of partitions.
The partitioning type plays a role in which predicates can be used. Range predicates
cannot be used for pruning on hash partitioned tables while they can be used for all
other partitioning strategies. However, on list partitioned tables, range predicates do not map to a range predicates do not map to a range of partitions. Equality and IN-list predicates can be used to prune with all the partitioning methods.

Advanced Partition Pruning Techniques
Oracle also prunes in the presence of more complex predicates or SQL statements
involving partitioned tables. A common situation is when a partitioned table is joined to the subset of another table, limited by a WHERE condition. For example, consider the following query:
SELECT t.day_number_in_month, SUM(s.amount_sold)
FROM sales s, times t
WHERE s.time_id = t.time_id
AND t.calendar_month_desc='2000-12'
GROUP BY t.day_number_in_month;
If we performed a nested loop join with times on the right hand side, then we would
only access the partition corresponding to this row from the times table. But, if we
performed a hash or sort merge join, this would not be possible. If the table with the WHERE predicate is relatively small compared to the partitioned table, and the
expected reduction of records or partitions for the partitioned table is significant,
Oracle will perform dynamic partition pruning using a recursive subquery. The
decision whether or not to invoke subquery pruning is an internal cost-based decision
of the optimizer.
A sample plan using a hash join operation would look like the following:
--------------------------------------------------------------------------------------------------
| Id| Operation | Name | Rows | Bytes| Cost (%CPU)| Time | Pstart | Pstop |
--------------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | | | 761 (100) | | | |
| 1| HASH GROUP BY | | 20 | 640 | 761 (41) |00:00:10| | |
|* 2| HASH JOIN | | 19153 | 598K | 749 (40) |00:00:09| | |
|* 3| TABLE ACCESS FULL | TIMES | 30 | 570 | 17 (6) |00:00:01| | |
| 4| PARTITION RANGE SUBQUERY | | 918K | 11M | 655 (33) |00:00:08| KEY(SQ)|KEY(SQ)|
| 5| TABLE ACCESS FULL | SALES | 918 | 11M | 655 (33) |00:00:08| KEY(SQ)|KEY(SQ)|
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------
2 - access("S"."TIME_ID"="T"."TIME_ID")
3 - filter("T"."CALENDAR_MONTH_DESC"='2000-12')
This plan shows that dynamic partition pruning occurred on the sales table.
Another example using advanced pruning is the following, which uses an OR
predicate:
SELECT p.promo_name promo_name, (s.profit - p.promo_cost) profit
FROM
promotions p,
(SELECT
promo_id,
SUM(sales.QUANTITY_SOLD * (costs.UNIT_PRICE - costs.UNIT_COST)) profit
FROM
sales, costs
WHERE
((sales.time_id BETWEEN TO_DATE('01-JAN-1998','DD-MON-YYYY',
'NLS_DATE_LANGUAGE = American') AND
TO_DATE('01-JAN-1999','DD-MON-YYYY', 'NLS_DATE_LANGUAGE = American')
OR
(sales.time_id BETWEEN TO_DATE('01-JAN-2001','DD-MON-YYYY',
'NLS_DATE_LANGUAGE = American') AND
TO_DATE('01-JAN-2002','DD-MON-YYYY', 'NLS_DATE_LANGUAGE = American')))
AND sales.time_id = costs.time_id
AND sales.prod_id = costs.prod_id
GROUP BY
promo_id) s
WHERE s.promo_id = p.promo_id
ORDER BY profit
DESC;
This query joins the sales and costs tables in the sh sample schema. The sales
table is partitioned by range on the column time_id. One of the conditions in the
query are two predicates on time_id, which are combined with an OR operator. This
OR predicate is used to prune the partitions in sales table and a single join between
sales and costs table is performed. The plan is as follows:
--------------------------------------------------------------------------------------------------
| Id| Operation | Name |Rows |Bytes |TmpSp|Cost(%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 4 | 200 | | 3556 (14)| 00:00:43| | |
| 1| SORT ORDER BY | | 4 | 200 | | 3556 (14)| 00:00:43| | |
|* 2| HASH JOIN | | 4 | 200 | | 3555 (14)| 00:00:43| | |
| 3| TABLE ACCESS FULL |PROMOTIONS| 503 | 16599| | 16 (0)| 00:00:01| | |
| 4| VIEW | | 4 | 68 | | 3538 (14)| 00:00:43| | |
| 5| HASH GROUP BY | | 4 | 164 | | 3538 (14)| 00:00:43| | |
| 6| PARTITION RANGE OR | | 314K| 12M| | 3321 (9)| 00:00:40|KEY(OR)|KEY(OR)|
|* 7| HASH JOIN | | 314K| 12M| 440K| 3321 (9)| 00:00:40| | |
|* 8| TABLE ACCESS FULL| SALES | 402K| 7467K| | 400 (39)| 00:00:05|KEY(OR)|KEY(OR)|
| 9| TABLE ACCESS FULL | COSTS |82112| 1764K| | 77 (24)| 00:00:01|KEY(OR)|KEY(OR)|
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("S"."PROMO_ID"="P"."PROMO_ID")
7 - access("SALES"."TIME_ID"="COSTS"."TIME_ID" AND "SALES"."PROD_ID"="COSTS"."PROD_ID")
8 - filter("SALES"."TIME_ID"<=TO_DATE('1999-01-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND
"SALES"."TIME_ID">=TO_DATE('1998-01-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') OR
"SALES"."TIME_ID">=TO_DATE('2001-01-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND
"SALES"."TIME_ID"<=TO_DATE('2002-01-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
Oracle also does additional pruning when a column is range partitioned on multiple
columns. As long as Oracle can guarantee that a particular predicate cannot be
satisfied in a particular partition, the partition will be skipped. This allows Oracle to
optimize cases where there are range predicates on more than one column or in the
case where there are no predicates on a prefix of the partitioning columns.

Related Posts Plugin for WordPress, Blogger...

Please Share

Twitter Delicious Facebook Digg Stumbleupon Favorites More

 
Follow TutorialBlogs
Share on Facebook
Tweet this Blog
Add Blog to Technorati
Home