1.

What impact do outliers have on a dataset? Explain with an example.

Answer»

Outliers can have a significant impact on the results of data analysis and statistical modeling. The main effects are as follows:

  • Outliers can decrease the normality of the data when they are non-randomly distributed.
  • Error variance increases, which can give an incorrect estimate of the overall population.
  • The power of statistical tests is reduced because the standard deviation is inflated.
  • The assumptions of ANOVA and other statistical models can be violated.
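The loss of statistical power can be seen in a minimal sketch like the one below. It computes a one-sample t statistic on a small illustrative sample, with and without one extreme value. The helper name `one_sample_t`, the sample values, and the null mean of 0 are assumptions chosen for illustration, not part of any particular library or study.

```python
import math
import statistics


def one_sample_t(values, null_mean=0.0):
    """Return the one-sample t statistic for H0: population mean == null_mean."""
    n = len(values)
    # Standard error uses the sample standard deviation (n - 1 denominator).
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - null_mean) / standard_error


# Illustrative sample (assumed values) and the same sample plus one outlier.
clean = [1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4]
with_outlier = clean + [200]

t_clean = one_sample_t(clean)
t_outlier = one_sample_t(with_outlier)

print(f"t without outlier: {t_clean:.2f}")
print(f"t with outlier:    {t_outlier:.2f}")
```

In this sketch the t statistic falls from about 7.9 to about 1.1 once the outlier is added, even though the sample mean is larger, because the inflated standard deviation dominates the standard error.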

Here is an example with a sample dataset.

Without Outlier

Dataset: 1,1,2,2,2,2,3,3,3,4,4

Mean = 2.45

Median = 2.00

Mode = 2.00

Standard deviation (sample) = 1.036

With Outlier

Dataset: 1,1,2,2,2,2,3,3,3,4,4,200

Mean = 18.92

Median = 2.50

Mode = 2.00

Standard deviation (sample) = 57.035

As the figures above show, including a single outlier (200) changes the mean and the standard deviation dramatically, while the median shifts only slightly and the mode does not change at all.
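The figures in this example can be reproduced with a short sketch using Python's standard `statistics` module. The helper name `summarize` is hypothetical, and the sketch assumes the sample standard deviation (n - 1 denominator), which is the definition that matches the values quoted above.

```python
import statistics


def summarize(values):
    """Return the four summary statistics used in the example."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample SD (n - 1 denominator)
    }


without_outlier = [1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4]
with_outlier = without_outlier + [200]

summary_without = summarize(without_outlier)
summary_with = summarize(with_outlier)

for label, summary in (
    ("Without outlier", summary_without),
    ("With outlier", summary_with),
):
    print(label)
    for name, value in summary.items():
        print(f"  {name:<6} = {value:.3f}")
```

Swapping `statistics.stdev` for `statistics.pstdev` would give the population standard deviation instead, which is somewhat smaller for both datasets.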


