|
Answer» The following are some metrics on which NLP models are evaluated:
- Accuracy: When the output variable is categorical or discrete, accuracy is used. It is the percentage of correct predictions made by the model compared to the total number of predictions made.
- Precision: Indicates how precise or exact the model's predictions are, i.e., of all the examples the model predicts as positive (the class we care about), how many are actually positive?
- Recall: Precision and recall are complementary. Recall measures how effectively the model can recall the positive class, i.e., of all the actual positive examples, how many does the model correctly identify?
- F1 score: This metric combines precision and recall into a single number that represents the trade-off between precision and recall, i.e., exactness and completeness. It is their harmonic mean: F1 = (2 × Precision × Recall) / (Precision + Recall).
- AUC: The area under the ROC curve. As the prediction threshold is changed, the curve traces the rate of correct positive predictions (true positive rate) against the rate of incorrect positive predictions (false positive rate), and the AUC summarizes that curve in a single number.
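
The metrics listed above can be sketched in a few lines of plain Python. This is only an illustration for the binary case: the helper names `classification_metrics` and `roc_auc`, the 0.5 threshold, and the sample labels and scores are all made up for the example and are not taken from any particular library. The AUC helper relies on the rank interpretation of ROC AUC (the probability that a randomly chosen positive example scores higher than a randomly chosen negative one), which avoids tracing the curve explicitly.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores, positive=1):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: true labels and model scores for eight examples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1, 0.45, 0.3]
# Assumed decision threshold of 0.5 turns scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(classification_metrics(y_true, y_pred))
# accuracy 0.625, precision ~0.667, recall 0.5, f1 ~0.571
print(roc_auc(y_true, scores))
# 0.875
```

On this toy data the model makes three positive predictions, two of which are right (precision 2/3), while it finds only two of the four actual positives (recall 1/2). Accuracy, precision, recall, and F1 depend on the chosen threshold; the AUC is computed from the raw scores and does not.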
|