InterviewSolution
This section includes InterviewSolutions, each offering curated multiple-choice questions to sharpen your knowledge and support exam preparation. Choose a topic below to get started.
| 1. |
Explain SCD (Slowly Change Dimension). |
|
Answer» SCD (Slowly Changing Dimensions) basically keep and manage both current and historical data in a data WAREHOUSE over time. Rather than changing regularly on a time-based schedule, SCD CHANGES slowly over time. SCD is considered one of the most CRITICAL ASPECTS of ETL. |
|
| 2. |
What do you mean by factless table? |
|
Answer» Factless tables do not contain any facts or MEASURES. It CONTAINS only dimensional keys and DEALS with EVENT occurrences at the informational level but not at the calculational level. As the name implies, factless fact tables capture relationships between dimensions but lack any numerical or textual data. Factual fact tables can be categorized into two categories: one that describes events, and the other one that describes conditions. Both may have a significant impact on your dimensional modeling. |
|
| 3. |
What is the benefit of using a Data reader destination adapter? |
|
Answer» ADO Recordset holds a COLLECTION of RECORDS (records and columns) from a DATABASE table. The Data Reader Destination Adapter is very useful when it comes to populating them in a SIMPLE manner. Using the ADO.NET DataReader interface, it exposes the data in a data flow for other applications to consume it. |
|
| 4. |
What do you mean by schema objects? |
|
Answer» Generally, a schema comprises a set of database objects, such as TABLES, views, INDEXES, clusters, database links, and synonyms, etc. This is a LOGICAL description or STRUCTURE of the database. Schema objects can be arranged in various ways in schema models designed for data warehousing. STAR and snowflake schemas are two examples of data warehouse schema models. |
|
| 5. |
Explain what you mean by Bus Schema. |
|
Answer» An important part of ETL is dimension identification, and this is largely DONE by the Bus Schema. A BUS schema is actually COMPRISED of a suite of VERIFIED dimensions and uniform definitions and can be used for handling dimension identification across all businesses. To put it another way, the bus schema identifies the common dimensions and facts across all the data marts of an organization just like IDENTIFYING conforming dimensions (dimensions with the same information/meaning when being referred to different fact tables). Using the Bus schema, information is given in a standard format with PRECISE dimensions in ETL. |
|
| 6. |
Explain the Snowflake schema. |
|
Answer» Adding additional dimension tables to a Star Schema makes it a Snowflake Schema. In the Snowflake schema model, multiple hierarchies of dimension tables surround a central fact table. Alternatively, a dimension table is called a snowflake if its low-cardinality attribute has been segmented into separate NORMALIZED tables. These normalized tables are then joined with referential constraints (foreign key constraints) to the original dimensions table. Snowflake schema complexity increases linearly with the level of HIERARCHY in the dimension tables. Advantages
Disadvantages |
|
| 7. |
What do you mean by staging area and write its main purpose? |
|
Answer» During the extract, transform, and load (ETL) process, a STAGING area or landing zone is used as an intermediate storage area. It SERVES as a TEMPORARY storage area between data sources and data warehouses. Staging areas are primarily used to extract data quickly from their respective data sources, THEREFORE minimizing the impact of those sources. Using the staging area, data is combined from multiple data sources, transformed, VALIDATED, and cleaned after data has been loaded. |
|
| 8. |
What do you mean by ODS (Operational data store)? |
|
Answer» Between the staging area and the Data Warehouse, ODS SERVES as a repository for data. UPON inserting the data into ODS, ODS will load all the data into the EDW (Enterprise data warehouse). The benefits of ODS mainly pertain to business operations, as it presents current, clean data from multiple sources in ONE place. Unlike other databases, an ODS DATABASE is read-only, and customers cannot update it. |
|
| 9. |
Define Grain of Fact. |
|
Answer» Accordingly, grain FACT REFERS to the LEVEL of storing fact INFORMATION. Alternatively, it is known as Fact Granularity. |
|
| 10. |
Explain what is fact and write its type. |
|
Answer» An important aspect of data warehousing is the FACT table. A fact table basically represents the measurements, metrics, or FACTS of a business process. In fact tables, facts are stored, and they are linked to a number of dimension tables via foreign keys. Facts are usually details and/or aggregated measurements of a business process which can be calculated and grouped to address the business question. Data schemas LIKE the star schema or snowflake schema consist of a central fact table surrounded by several dimension tables. The measures or numbers like sales, cost, profit and loss, etc., are some examples of facts. Fact tables have two types of columns, foreign keys and measures columns. Foreign keys store foreign keys to dimensions, while measures contain NUMERIC facts. Other attributes can be added, depending on the business need and necessity. Types of Facts Facts can be divided into three basic types, as follows:
|
|
| 11. |
Can you define cubes and OLAP cubes? |
|
Answer» The CUBE is one of the things on which data PROCESSING relies HEAVILY. In their simplest form, CUBES are just data processing units that contain dimensions and fact tables from the Data warehouse. It provides clients with a multidimensional view of data, querying, and analysis capabilities. On the other hand, Online Analytical Processing (OLAP) is software that allows you to analyze data from several databases at the same time. For reporting purposes, an OLAP cube can be used to store data in the multidimensional form. With the cubes, creating and viewing reports BECOMES easier, as well as smoothing and improving the reporting process. The end users are responsible for managing and maintaining these cubes, who have to manually update their data. |
|
| 12. |
Mention some of the ETL bugs. |
|
Answer» FOLLOWING are a few common ETL bugs:
|
|
| 13. |
State difference between ETL testing and manual testing. |
||||||||||||||
Answer»
|
|||||||||||||||