InterviewSolution
| 1. |
What is cross-validation? What is meant by 5-fold cross-validation? |
|
Answer» Cross-validation (CV) is a technique for assessing how well the results of a statistical analysis or machine learning model will generalize to an independent dataset. The dataset is divided into a training set and a test set. The model is fitted on the training set and then evaluated on the test set, which it has not seen during training. A sample representation can be illustrated below. Here the training and test data are shuffled randomly so that each iteration uses a different split. The objective of CV is to test a model's ability to predict new data that was not used while training or estimating the model, which helps identify issues such as overfitting or bias. The CV results therefore indicate how well the model is likely to generalize. In 5-fold CV, the dataset is split into 5 folds of roughly equal size and the procedure runs for 5 iterations: in each iteration a different fold is held out as the test set while the remaining 4 folds are used for training, and the 5 resulting scores are usually averaged into a single estimate. This could be represented or illustrated by the below image. |
|
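The 5-fold procedure can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the helper names `k_fold_indices` and `cross_validate_mean_model`, the toy "predict the training mean" model, and the sample values in `y` are all assumptions made for the example and do not come from the original answer.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration only.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]                 # held-out fold
        train_idx = indices[:start] + indices[start + size:]   # remaining folds
        yield train_idx, test_idx
        start += size


def cross_validate_mean_model(y, k=5, seed=0):
    """Score a toy model that always predicts the training mean.

    Returns one mean squared error per fold.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k, seed):
        # "Training": the model is just the mean of the training targets.
        prediction = sum(y[i] for i in train_idx) / len(train_idx)
        # "Testing": mean squared error on the held-out fold only.
        mse = sum((y[i] - prediction) ** 2 for i in test_idx) / len(test_idx)
        scores.append(mse)
    return scores


# Made-up sample data, assumed for the example.
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
fold_scores = cross_validate_mean_model(y, k=5)
print(fold_scores)                          # one score per fold
print(sum(fold_scores) / len(fold_scores))  # averaged CV estimate
```

Every sample lands in the test set exactly once across the 5 iterations, which is what distinguishes k-fold CV from a single random train/test split. In practice a library routine such as scikit-learn's `KFold` would normally replace the hand-written splitter.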