Informatica Tutorials

Big Data Analytics

Optimizing Star Queries

You should consider the following when using star queries:
■ Tuning Star Queries
■ Using Star Transformation


Tuning Star Queries
To get the best possible performance for star queries, it is important to follow some
basic guidelines:

■ A bitmap index should be built on each of the foreign key columns of the fact table
or tables.
■ The initialization parameter STAR_TRANSFORMATION_ENABLED should be set to
TRUE. This enables an important optimizer feature for star-queries. It is set to
FALSE by default for backward-compatibility.

When a data warehouse satisfies these conditions, the majority of the star queries
running in the data warehouse will use a query execution strategy known as the star
transformation. The star transformation provides very efficient query performance for
star queries.

Using Star Transformation
The star transformation is a powerful optimization technique that relies upon
implicitly rewriting (or transforming) the SQL of the original star query. The end user never needs to know any of the details about the star transformation. Oracle's query optimizer automatically chooses the star transformation where appropriate.

The star transformation is a query transformation aimed at executing star queries
efficiently. Oracle processes a star query using two basic phases. The first phase
retrieves exactly the necessary rows from the fact table (the result set). Because this retrieval utilizes bitmap indexes, it is very efficient. The second phase joins this result set to the dimension tables. An example of an end user query is: "What were the sales and profits for the grocery department of stores in the west and southwest sales districts over the last three quarters?" This is a simple star query.


Star Transformation with a Bitmap Index
A prerequisite of the star transformation is that there be a single-column bitmap index on every join column of the fact table. These join columns include all foreign key columns.

For example, the sales table of the sh sample schema has bitmap indexes on the
time_id, channel_id, cust_id, prod_id, and promo_id columns.
Consider the following star query:

SELECT ch.channel_class, c.cust_city, t.calendar_quarter_desc,
SUM(s.amount_sold) sales_amount
FROM sales s, times t, customers c, channels ch
WHERE s.time_id = t.time_id
AND s.cust_id = c.cust_id
AND s.channel_id = ch.channel_id
AND c.cust_state_province = 'CA'
AND ch.channel_desc in ('Internet','Catalog')
AND t.calendar_quarter_desc IN ('1999-Q1','1999-Q2')
GROUP BY ch.channel_class, c.cust_city, t.calendar_quarter_desc;

This query is processed in two phases. In the first phase, Oracle Database uses the
bitmap indexes on the foreign key columns of the fact table to identify and retrieve
only the necessary rows from the fact table. That is, Oracle Database will retrieve the result set from the fact table using essentially the following query:

SELECT ... FROM sales
WHERE time_id IN
(SELECT time_id FROM times
WHERE calendar_quarter_desc IN('1999-Q1','1999-Q2'))
AND cust_id IN
(SELECT cust_id FROM customers WHERE cust_state_province='CA')
AND channel_id IN
(SELECT channel_id FROM channels WHERE channel_desc IN('Internet','Catalog'));

This is the transformation step of the algorithm, because the original star query has
been transformed into this subquery representation. This method of accessing the fact
table leverages the strengths of bitmap indexes. Intuitively, bitmap indexes provide a set-based processing scheme within a relational database. Oracle has implemented
very fast methods for doing set operations such as AND (an intersection in standard
set-based terminology), OR (a set-based union), MINUS, and COUNT.

In this star query, a bitmap index on time_id is used to identify the set of all rows in the fact table corresponding to sales in 1999-Q1. This set is represented as a bitmap (a string of 1's and 0's that indicates which rows of the fact table are members of the set).

A similar bitmap is retrieved for the fact table rows corresponding to the sale from
1999-Q2. The bitmap OR operation is used to combine this set of Q1 sales with the set
of Q2 sales.

Additional set operations will be done for the customer dimension and the product
dimension. At this point in the star query processing, there are three bitmaps. Each
bitmap corresponds to a separate dimension table, and each bitmap represents the set
of rows of the fact table that satisfy that individual dimension's constraints.


How Oracle Chooses to Use Star Transformation
The optimizer generates and saves the best plan it can produce without the
transformation. If the transformation is enabled, the optimizer then tries to apply it to the query and, if applicable, generates the best plan using the transformed query.

Based on a comparison of the cost estimates between the best plans for the two
versions of the query, the optimizer will then decide whether to use the best plan for the transformed or untransformed version.

If the query requires accessing a large percentage of the rows in the fact table, it might be better to use a full table scan and not use the transformations. However, if the constraining predicates on the dimension tables are sufficiently selective that only a small portion of the fact table needs to be retrieved, the plan based on the
transformation will probably be superior.

Note that the optimizer generates a subquery for a dimension table only if it decides
that it is reasonable to do so based on a number of criteria. There is no guarantee that subqueries will be generated for all dimension tables. The optimizer may also decide, based on the properties of the tables and the query, that the transformation does not merit being applied to a particular query. In this case the best regular plan will be used.

Related Posts Plugin for WordPress, Blogger...

Please Share

Twitter Delicious Facebook Digg Stumbleupon Favorites More

 
Follow TutorialBlogs
Share on Facebook
Tweet this Blog
Add Blog to Technorati
Home