Informatica Tutorials

ETL Basics in Data Warehousing

What happens during the ETL process? The following tasks are the main actions in the
process.

Extraction of Data
During extraction, the desired data is identified and extracted from many different
sources, including database systems and applications. Very often, it is not possible to identify the specific subset of interest, so more data than necessary has to be extracted, and the relevant data is identified at a later point in time.
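
The "extract more than necessary, identify the relevant data later" pattern can be sketched as follows. This is only a minimal illustration: the in-memory SQLite database stands in for a real source system, and the `orders` table, its columns, and the helper names `extract_all` and `select_relevant` are hypothetical, not part of any particular ETL tool.

```python
import sqlite3

# Hypothetical source system: an in-memory SQLite table standing in
# for an operational database (table and column names are assumptions).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "US", 75.5), (3, "EU", 40.0), (4, "APAC", 310.0)],
)


def extract_all(conn):
    """Extract every row, because the subset of interest is not yet known."""
    return conn.execute(
        "SELECT id, region, amount FROM orders ORDER BY id"
    ).fetchall()


def select_relevant(rows, region):
    """Later step: keep only the rows that turn out to be relevant."""
    return [row for row in rows if row[1] == region]


staged = extract_all(source)               # more data than strictly necessary
relevant = select_relevant(staged, "EU")   # identification happens afterwards
print(len(staged), "rows extracted,", len(relevant), "rows relevant")
```

The trade-off is the one described above: the extraction step stays simple and does not depend on knowing the selection criteria, at the cost of moving and staging more data than is finally used.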

Depending on the source system's capabilities (for example, operating system
resources), some transformations may take place during this extraction process. The
size of the extracted data varies from hundreds of kilobytes up to gigabytes,
depending on the source system and the business situation. The same is true for the
time delta between two (logically) identical extractions: the time span may vary
from days or hours down to minutes or near real time. Web server log files, for example,
can easily grow to hundreds of megabytes in a very short period of time.

Transportation of Data
After data is extracted, it has to be physically transported to the target system or to an intermediate system for further processing. Depending on the chosen method of
transportation, some transformations can be done during this process, too. For
example, a SQL statement that directly accesses a remote target through a gateway
can concatenate two columns as part of the SELECT statement.
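
A transformation performed during transport might look like the sketch below. It is an illustration under stated assumptions: two in-memory SQLite connections stand in for the source and the target (no real gateway is involved), and the `customers` and `customers_stage` tables and the `transport` helper are hypothetical. The `||` operator is the standard SQL string concatenation operator, which SQLite and Oracle both support.

```python
import sqlite3

# Hypothetical source and target; a real setup would use a gateway or
# database link instead of two local in-memory databases.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE customers (id INTEGER, first_name TEXT, last_name TEXT)"
)
source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ada", "Lovelace"), (2, "Alan", "Turing")],
)

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customers_stage (id INTEGER, full_name TEXT)")


def transport(src, dst):
    """Move rows to the target, concatenating two columns in the SELECT."""
    rows = src.execute(
        "SELECT id, first_name || ' ' || last_name FROM customers ORDER BY id"
    ).fetchall()
    dst.executemany("INSERT INTO customers_stage VALUES (?, ?)", rows)
    dst.commit()
    return len(rows)


moved = transport(source, target)
result = target.execute(
    "SELECT id, full_name FROM customers_stage ORDER BY id"
).fetchall()
print(moved, "rows transported:", result)
```

Because the concatenation happens inside the SELECT, the target receives the combined column directly and no separate transformation pass is needed for it.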

The emphasis in many of the examples in this section is scalability. Many long-time
users of Oracle Database are experts in programming complex data transformation
logic using PL/SQL. These chapters suggest alternatives for many such data
manipulation operations, with a particular emphasis on implementations that take
advantage of Oracle's new SQL functionality, especially for ETL and the parallel query infrastructure.