| 1. |
What Is Data Cleaning? How Can We Do That? |
|
Answer» Data cleaning is a self-explanatory term: it means correcting or removing discrepancies in data before that data is loaded into the data warehouse. Most data warehouses source data from multiple systems, and many of those systems were created long before data warehousing was well understood, so they were built without any plan to consolidate their data in a single repository of information. In such a scenario, problems like the following can arise: ► Missing information for a column from one of the data sources. To ensure that the data warehouse is not contaminated by such discrepancies, the data must be cleansed using a set of business rules before it makes its way into the data warehouse. |
|
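The idea of cleansing data with business rules before loading can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (`customer_id`, `country`), the default value, and the two rules themselves (reject a row when a required column is missing, fill a default when an optional column is missing) are hypothetical and do not come from any particular warehouse or tool.

```python
from typing import Any

# Hypothetical business rules (assumptions for illustration only):
# rows missing a required column are rejected, and optional columns
# with missing information are filled with a default value.
REQUIRED_COLUMNS = ("customer_id",)
DEFAULTS = {"country": "UNKNOWN"}


def is_missing(value: Any) -> bool:
    """Treat None and blank strings as missing information."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def cleanse(rows):
    """Split source rows into (clean, rejected) according to the rules above."""
    clean, rejected = [], []
    for row in rows:
        # Rule 1: a row without a required column never reaches the warehouse.
        if any(is_missing(row.get(col)) for col in REQUIRED_COLUMNS):
            rejected.append(row)
            continue
        # Rule 2: fill missing optional columns on a copy of the row.
        fixed = dict(row)
        for col, default in DEFAULTS.items():
            if is_missing(fixed.get(col)):
                fixed[col] = default
        clean.append(fixed)
    return clean, rejected


# Rows as they might arrive from two different source systems.
source_rows = [
    {"customer_id": "C1", "country": "IN"},
    {"customer_id": "C2", "country": ""},
    {"customer_id": None, "country": "US"},
]
clean_rows, rejected_rows = cleanse(source_rows)
```

Only `clean_rows` would be passed on to the load step. Keeping `rejected_rows` separate, rather than silently dropping them, lets the source system owners review and correct the rows that failed the rules.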