4.2. Data Cleaning
Data cleaning is an important process for ensuring that the data is of high quality, representative, and unbiased.
The data cleaning process typically involves the following steps:
selection of clean subsets of the data
insertion of suitable defaults
estimation of missing data by modeling
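The three steps above can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not a prescribed method: the records, the field names (`age`, `income`, `country`), the default value `"unknown"`, and the choice of a least-squares line as the model are all hypothetical and chosen for the example.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
records = [
    {"age": 25, "income": 30000.0, "country": "DE"},
    {"age": 35, "income": 40000.0, "country": None},
    {"age": 45, "income": 50000.0, "country": "FR"},
    {"age": 40, "income": None, "country": "DE"},
    {"age": None, "income": 42000.0, "country": "FR"},
]


def select_clean_subset(rows, required):
    """Keep only rows in which every required field is present."""
    return [r for r in rows if all(r[f] is not None for f in required)]


def insert_defaults(rows, defaults):
    """Return copies of rows with missing fields replaced by defaults."""
    return [
        {k: (defaults[k] if v is None and k in defaults else v) for k, v in r.items()}
        for r in rows
    ]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def estimate_missing(rows, target, predictor):
    """Fill missing target values using a line fitted on complete rows."""
    complete = [r for r in rows if r[target] is not None and r[predictor] is not None]
    slope, intercept = fit_line(
        [r[predictor] for r in complete], [r[target] for r in complete]
    )
    return [
        {**r, target: slope * r[predictor] + intercept}
        if r[target] is None and r[predictor] is not None
        else r
        for r in rows
    ]


# 1. Selection: drop rows without the predictor we rely on later.
subset = select_clean_subset(records, required=["age"])
# 2. Defaults: a categorical gap gets an explicit placeholder.
with_defaults = insert_defaults(subset, {"country": "unknown"})
# 3. Modeling: estimate missing income from age.
cleaned = estimate_missing(with_defaults, target="income", predictor="age")
```

Each function returns new rows rather than modifying `records`, so the raw data stays available for comparison when the cleaning decisions are documented.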
It is important to produce a data cleaning report that documents:
the decisions and actions that were taken to resolve data quality issues
the data transformations that took place in this step
the possible impact on the results of the project analysis
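One possible way to keep such a report is to record each cleaning decision as a structured entry and render the entries as text. The sketch below is an assumption about how this could be organized; the class names `CleaningEntry` and `CleaningReport`, their fields, and the sample entry are hypothetical and not part of any standard.

```python
from dataclasses import dataclass, field


@dataclass
class CleaningEntry:
    issue: str            # the data quality problem that was found
    decision: str         # what was decided and done about it
    transformation: str   # how the data changed as a result
    impact: str           # possible effect on the analysis results


@dataclass
class CleaningReport:
    entries: list = field(default_factory=list)

    def add(self, issue, decision, transformation, impact):
        self.entries.append(CleaningEntry(issue, decision, transformation, impact))

    def render(self):
        """Format the report as plain text, one numbered block per entry."""
        lines = []
        for i, e in enumerate(self.entries, start=1):
            lines += [
                f"{i}. Issue: {e.issue}",
                f"   Decision: {e.decision}",
                f"   Transformation: {e.transformation}",
                f"   Possible impact: {e.impact}",
            ]
        return "\n".join(lines)


# Hypothetical entry, for illustration only.
report = CleaningReport()
report.add(
    issue="income missing in some rows",
    decision="estimate from age with a fitted line",
    transformation="missing income values replaced by model estimates",
    impact="variance of income may be understated",
)
text = report.render()
```

Recording the entry at the moment a cleaning step is applied keeps the report in step with the transformations actually performed.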