InterviewSolution
| 1. |
Does the vocabulary of a corpus remain the same before and after text normalization? Why? |
|
Answer» No. The vocabulary of a corpus changes after text normalization, and it usually becomes smaller. Reasons: ● Normalization passes the text through several steps that reduce it to a smaller set of distinct tokens, because the machine needs the essence of the text rather than grammatically correct statements. ● Stop words, special characters, and numbers are removed during normalization, so those tokens drop out of the vocabulary. ● Stemming strips the affixes from words so that variants such as "playing" and "played" collapse to a common stem ("play"); the stem is not always a valid dictionary word. So, after normalization, we get a reduced vocabulary. |
|
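
The effect described in the answer can be checked on a toy corpus. The following is a minimal sketch, not a production pipeline: the stop-word list `STOP_WORDS`, the helper `crude_stem`, and the sample sentence are all hypothetical and chosen only for illustration. In particular, `crude_stem` is a naive suffix stripper standing in for a real stemmer such as Porter's.

```python
import re

# Hypothetical, deliberately tiny stop-word list (real lists are much longer).
STOP_WORDS = {"the", "is", "are", "a", "an", "and", "on", "in", "of", "to"}

# Naive suffix list; an assumption for this sketch, not a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def raw_vocabulary(text):
    """Distinct whitespace-separated tokens; case and punctuation are kept."""
    return set(text.split())


def normalized_vocabulary(text):
    """Lowercase, drop non-letters, remove stop words, then stem."""
    text = text.lower()
    # Replace everything except letters and whitespace (digits, punctuation).
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return {crude_stem(t) for t in tokens}


corpus = "The cats are playing. A cat played on the mat! 2 Cats play."
before = raw_vocabulary(corpus)
after = normalized_vocabulary(corpus)

print(len(before), sorted(before))
print(len(after), sorted(after))
```

On this sample sentence the vocabulary shrinks from 13 distinct raw tokens to 3 (`cat`, `mat`, `play`). The exact numbers depend entirely on the corpus and on which normalization steps are applied.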