Informatica Tutorials

Overview of Extraction in Data Warehouses

Extraction is the operation of pulling data out of a source system for further use in a data warehouse environment. It is the first step of the ETL (extract, transform, load) process. Once
extracted, the data can be transformed and loaded into the data warehouse.
The source systems for a data warehouse are typically transaction processing
applications. For example, one of the source systems for a sales analysis data
warehouse might be an order entry system that records all of the current order
activities.

Designing and creating the extraction process is often one of the most time-consuming
tasks in the ETL process and, indeed, in the entire data warehousing process. The
source systems might be very complex and poorly documented, and thus determining
which data needs to be extracted can be difficult. The data normally has to be
extracted not just once but periodically, so that all changed data reaches the
data warehouse and keeps it up to date. Moreover, the source system
typically cannot be modified, nor can its performance or availability be adjusted, to
accommodate the needs of the data warehouse extraction process.
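
One common way to pick up only the changed data on each periodic run is to filter on a last-modified timestamp. The sketch below is a minimal illustration of that idea, not a prescribed method. It assumes a hypothetical `orders` table that happens to carry an `updated_at` column, and it uses an in-memory SQLite database to stand in for the source system. The extraction itself is a read-only query, in keeping with the constraint that the source cannot be modified for the warehouse's benefit.

```python
import sqlite3


def extract_changed_rows(conn, last_extracted_at):
    """Return rows modified after the previous extraction run (read-only query)."""
    cur = conn.execute(
        "SELECT order_id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_extracted_at,),
    )
    return cur.fetchall()


def build_demo_source():
    """Create a stand-in source system; the table and columns are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, 10.0, "2024-01-01T09:00:00"),
            (2, 25.5, "2024-01-02T09:00:00"),
            (3, 7.25, "2024-01-03T09:00:00"),
        ],
    )
    conn.commit()
    return conn


source = build_demo_source()

# First run: a very old watermark selects every row.
first_run = extract_changed_rows(source, "1970-01-01T00:00:00")

# Remember the newest timestamp seen; the next run starts from here.
watermark = first_run[-1][2]

# Simulate activity in the order entry system between two extraction runs.
source.execute(
    "UPDATE orders SET amount = 30.0, updated_at = '2024-01-04T09:00:00' "
    "WHERE order_id = 2"
)

# Second run: only the row changed since the watermark is returned.
second_run = extract_changed_rows(source, watermark)
```

This approach only works when the source already records modification times. It also misses deleted rows, so it is one option among several rather than a general answer.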

These are important considerations for extraction and ETL in general. This chapter,
however, focuses on the technical considerations of having different kinds of sources
and extraction methods. It assumes that the data warehouse team has already
identified the data that will be extracted, and discusses common techniques used for
extracting data from source databases.

Designing this process means making decisions about the following two main aspects:
■ Which extraction method do I choose?
This influences the source system, the transportation process, and the time needed
for refreshing the warehouse.
■ How do I provide the extracted data for further processing?
This influences the transportation method and the need for cleaning and
transforming the data.
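
As an illustration of the second question, one possible way to provide extracted data is to write it to a delimited flat file that the next ETL stage picks up. The sketch below assumes that choice. The function name `write_extract` and the column names are hypothetical, and an in-memory buffer stands in for a real file so the example is self-contained.

```python
import csv
import io


def write_extract(rows, columns, out):
    """Write a header line plus the extracted rows as CSV to an open text stream."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return len(rows)


# Hypothetical extracted rows; in practice these would come from the source query.
rows = [(1, 10.0), (2, 25.5)]

buffer = io.StringIO()
count = write_extract(rows, ["order_id", "amount"], buffer)
extract_text = buffer.getvalue()
```

A flat file keeps the hand-off independent of the source database, but it pushes type handling and cleaning onto the transformation step, which is the trade-off the second question raises.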
