InterviewSolution
1. What are outliers?
Answer» Outliers are data points or values that lie very far from the rest of the data and do not belong to any particular group or cluster. Their presence may affect the behavior of the model, so care must be taken to identify and treat them properly. Outliers are often regarded as bad data points, but they may also carry valuable information, so they should be handled carefully and their presence in the data set investigated. Outliers in the input data may skew the results and mislead the training of machine learning algorithms. This results in:
It is observed that many machine learning models are sensitive to:
The presence of outliers may create misleading representations, which in turn lead to misleading interpretations of the collected data. In descriptive statistics, for instance, outliers may skew the mean and standard deviation of the attribute values. These effects can be observed in plots such as scatterplots and histograms. For some problems, however, the outliers themselves can be the more relevant data points. For example, anomalies in:
Some of the outlier detection methods are:
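To make the effect on the mean and standard deviation concrete, here is a minimal Python sketch that also flags outliers with the interquartile range (IQR) rule. The IQR rule, the 1.5 multiplier, the sample values, and the helper name `iqr_outliers` are assumptions chosen for illustration; they are not taken from the answer itself, and other detection methods may suit a given data set better.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; k=1.5 is a common convention,
    not a value stated in the answer.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up sample: nine typical readings plus one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 95]

flagged = iqr_outliers(data)
clean = [v for v in data if v not in flagged]

# Compare summary statistics with and without the flagged values.
print("flagged values:       ", flagged)
print("mean with outliers:   ", statistics.mean(data))
print("mean without outliers:", statistics.mean(clean))
print("stdev with outliers:   ", statistics.stdev(data))
print("stdev without outliers:", statistics.stdev(clean))
```

In this sample the single extreme value pulls both the mean and the standard deviation well above the values computed on the remaining points. Whether a flagged point should be removed, corrected, or kept still depends on investigating why it is there.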