Informatica Tutorials

Big Data Analytics

What are Dimensions?

A dimension is a structure that categorizes data in order to enable users to answer
business questions. Commonly used dimensions are customers, products, and time.
For example, each sales channel of a clothing retailer might gather and store data
regarding sales and reclamations of their Cloth assortment. The retail chain
management can build a data warehouse to analyze the sales of its products across all
stores over time and help answer questions such as:

■ What is the effect of promoting one product on the sale of a related product that is
not promoted?
■ What are the sales of a product before and after a promotion?
■ How does a promotion affect the various distribution channels?

The data in the retailer's data warehouse system has two important components:
dimensions and facts. The dimensions are products, customers, promotions, channels,
and time. One approach for identifying your dimensions is to review your reference
tables, such as a product table that contains everything about a product, or a
promotion table containing all information about promotions. The facts are sales (units sold) and profits. A data warehouse contains facts about the sales of each product at on a daily basis.

A typical relational implementation for such a data warehouse is a star schema. The
fact information is stored in what is called a fact table, whereas the dimensional
information is stored in dimension tables. In our example, each sales transaction
record is uniquely defined as for each customer, for each product, for each sales
channel, for each promotion, and for each day (time).

In Oracle Database, the dimensional information itself is stored in a dimension table.
In addition, the database object dimension helps to organize and group dimensional
information into hierarchies. This represents natural 1:n relationships between
columns or column groups (the levels of a hierarchy) that cannot be represented with
constraint conditions. Going up a level in the hierarchy is called rolling up the data and going down a level in the hierarchy is called drilling down the data. In the retailer
■ Within the time dimension, months roll up to quarters, quarters roll up to years,
and years roll up to all years.
■ Within the product dimension, products roll up to subcategories, subcategories
roll up to categories, and categories roll up to all products.
■ Within the customer dimension, customers roll up to city. Then cities roll up to
state. Then states roll up to country. Then countries roll up to subregion.
Finally, subregions roll up to region, as shown in following figure

Data analysis typically starts at higher levels in the dimensional hierarchy and
gradually drills down if the situation warrants such analysis.
Dimensions do not have to be defined. However, if your application uses dimensional
modeling, it is worth spending time creating them as it can yield significant benefits, because they help query rewrite perform more complex types of rewrite.

Dimensions are also beneficial to certain types of materialized view refresh operations and with the SQL Access Advisor. They are only mandatory if you use the SQL Access Advisor (a GUI tool for materialized view and index management) without a workload to recommend which materialized views and indexes to create, drop, or retain.

In spite of the benefits of dimensions, you must not create dimensions in any schema
that does not fully satisfy the dimensional relationships described in this chapter.
Incorrect results can be returned from queries otherwise.

Related Posts Plugin for WordPress, Blogger...

Please Share

Twitter Delicious Facebook Digg Stumbleupon Favorites More

Follow TutorialBlogs
Share on Facebook
Tweet this Blog
Add Blog to Technorati