InterviewSolution
| 1. |
What is regularization and why is it helpful in the context of data science? |
|
Answer» Regularization is the process of adding a penalty term, scaled by a tuning parameter, to a model's objective function in order to induce smoothness and prevent overfitting. The penalty discourages the coefficients from fitting the training data perfectly, which reduces the risk of overfitting. It is usually implemented by adding a constant multiple of a norm of the weight vector to the loss. The norm is most often L1 (Lasso) or L2 (Ridge), although in principle any norm can be used. The model is then fit by minimizing the mean loss (error) on the training set plus this penalty. L1 (Lasso) regularization can drive some coefficients to exactly zero, so it performs feature selection in sparse feature spaces, and that is a good practical reason to use L1 in some situations. Beyond that particular reason, however, L1 may not perform better than L2 in practice. Even where L1's sparsity is useful for feature selection, applying L2 to the variables that remain after selection is likely to give better results than L1 by itself. |
|
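The difference between the two penalties described in the answer can be seen in a small NumPy sketch. Everything here is an illustrative assumption rather than part of the original answer: the synthetic data, the helper names `ridge_fit` and `lasso_fit`, and the penalty strengths are all hypothetical. Ridge is solved in closed form, and Lasso is solved with a simple proximal gradient loop (ISTA), which is only one of several possible solvers.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Minimize ||y - Xw||^2 + lam * ||w||_2^2 using the closed-form solution."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


def lasso_fit(X, y, lam, n_iter=1000):
    """Minimize 0.5 * ||y - Xw||^2 + lam * ||w||_1 by proximal gradient (ISTA)."""
    w = np.zeros(X.shape[1])
    # Step size 1/L, where L (squared spectral norm of X) bounds the gradient's
    # Lipschitz constant.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        z = w - step * grad
        # Soft-thresholding: shrinks every entry and sets small ones exactly to 0.
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w


# Hypothetical synthetic data: only the first 3 of 10 features matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
true_w = np.zeros(10)
true_w[:3] = [3.0, -2.0, 1.5]
y = X @ true_w + 0.1 * rng.normal(size=100)

w_ols = ridge_fit(X, y, lam=0.0)     # no penalty: ordinary least squares
w_ridge = ridge_fit(X, y, lam=50.0)  # L2 penalty shrinks all coefficients
w_lasso = lasso_fit(X, y, lam=20.0)  # L1 penalty zeroes irrelevant coefficients

print("OLS   norm:", np.linalg.norm(w_ols))
print("Ridge norm:", np.linalg.norm(w_ridge))
print("Lasso nonzero coefficients:", np.count_nonzero(w_lasso))
```

In this setup the L2 penalty shrinks the whole weight vector without removing any feature, while the L1 penalty leaves only a few coefficients nonzero, which is the feature-selection behavior the answer refers to. In practice the penalty strength `lam` would be chosen by cross-validation rather than fixed by hand.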