ETL Testing Interview Questions

ETL Testing Interview Questions

    • What do you understand by an ETL?

ETL stands for Extract, Transform, and Load. It is an important concept in Data Warehousing systems. Extraction stands for extracting data from different data sources such as transactional systems or applications. Transformation stands for applying the conversion rules on data so that it becomes suitable for analytical reporting. The loading process involves moving the data into the target system, normally a data warehouse.

ETL Testing Interview Questions

    • Explain the 3-layer architecture of an ETL cycle

The three layers involved in an ETL cycle are –
Staging Layer − The staging layer is used to store the data extracted from different source data systems.
Data Integration Layer − The integration layer transforms the data from the staging layer and moves the data to a database, where the data is arranged into hierarchical groups, often called dimensions, and into facts and aggregate facts. The combination of facts and dimensions tables in a DW system is called a schema.
Access Layer − The access layer is used by end-users to retrieve the data for analytical reporting.

ETL Testing Interview Questions

    • What is the difference between and ETL and BI tools?

An ETL tool is used to extract data from different data sources, transform the data, and load it into a DW system. In contrast, a BI tool is used to generate interactive and adhoc reports for end-users, dashboard for senior management, data visualizations for monthly, quarterly, and annual board meetings.
Most common ETL tools include − SAP BO Data Services (BODS), Informatica, Microsoft – SSIS, Oracle Data Integrator ODI, Talend Open Studio, Clover ETL Open source, etc.
Most common BI tools include − SAP Business Objects, SAP Lumira, IBM Cognos, JasperSoft, Microsoft BI Platform, Tableau, Oracle Business Intelligence Enterprise Edition, etc.

ETL Testing Interview Questions

    • What are the popular ETL tools available in the market?

The popular ETL tools available in the market are –

Informatica − Power Center
IBM − Websphere DataStage (Formerly known as Ascential DataStage)
SAP − Business Objects Data Services BODS
IBM − Cognos Data Manager (Formerly known as Cognos Decision Stream)
Microsoft − SQL Server Integration Services SSIS
Oracle − Data Integrator ODI (Formerly known as Sunopsis Data Conductor)
SAS − Data Integration Studio
Oracle − Warehouse Builder
Open source Clover ETL

ETL Testing Interview Questions

    • Why do we need a staging area in an ETL process?

Staging area is an intermediate area that sits between data sources and data warehouse/data marts systems. Staging areas can be designed to provide many benefits, but the primary motivations for their use are to increase efficiency of ETL processes, ensure data integrity, and support data quality operations.

ETL Testing Interview Questions

    • What is the difference between data warehousing and data mining?

Data warehousing is a broader concept as compared to data mining. Data mining involves extracting hidden information from data and interpret it for future predictions. In contrast data warehousing includes operations such as analytical reporting to generate detailed reports and ad-hoc reports, information processing to generate interactive dashboards and charts.

ETL Testing Interview Questions

    • What are the structural differences between an OLTP and OLAP system?

OLTP stands for Online Transactional Processing system which is commonly a relational database and is used to manage day-to-day transactions.
OLAP stands for Online Analytical Processing system which is commonly a multidimensional system and is also called data warehouse.

ETL Testing Interview Questions

    • What is a Dimension table and how is it different from a Fact table?

Suppose a company sells its products to customers. Every sale is a fact that takes place within the company and the fact table is used to record these facts. Each fact table stores the primary keys to join the fact table to dimension tables and measures/facts.

Example − Fact_Units
Cust_ID Prod_Id Time_Id No. of units sold
101 24 1 25
102 25 2 15
103 26 3 30
A dimension table stores attributes or dimensions that describe the objects in a fact table. It is a set of companion tables to a fact table.
Example − Dim_Customer
Cust_id Cust_Name Gender
101 Jason M
102 Anna F

ETL Testing Interview Questions

    • What is a Data Mart?

A data mart is a simple form of data warehouse and it is focused on a single functional area. It usually gets data only from a few sources.
Example − In an organization, data marts may exists for Finance, Marketing, Human Resource, and other individual departments which store data related to their specific functions.

ETL Testing Interview Questions

    • What is an Aggregate function? Name a few common aggregate functions.

Aggregate functions are used to group multiple rows of a single column to form a more significant measurement. They are also used for performance optimization when we save aggregated tables in data warehouse.

Common Aggregate functions are –

MIN returns the smallest value in a given column
MAX returns the largest value in a given column
SUM returns the sum of the numeric values in a given column
AVG returns the average value of a given column
COUNT returns the total number of values in a given column
COUNT(*) returns the number of rows in a table

SELECT AVG(salary)
FROM employee
WHERE title = ‘developer’;

ETL Testing Interview Questions


This information box about the author only appears if the author has biographical information. Otherwise there is not author box shown. Follow YOOtheme on Twitter or read the blog.
+91 9952948899