Since the beginning of Natural Language Processing (NLP), there has been the need to transform text into something machines can understand.
That is, transforming text into a meaningful vector (or array) of numbers. A standard way of doing this is the bag-of-words approach, where each document is represented by the words it contains, ignoring their order.
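CountVectorizer and TfidfVectorizer, both from scikit-learn, are two common ways to convert text into numbers: the first counts how often each word occurs in a document, and the second reweights those counts by TF-IDF.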
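To make the bag-of-words idea concrete, here is a minimal pure-Python sketch of the kind of count vector a CountVectorizer-style approach produces. The toy corpus, the whitespace tokenizer, and the helper name `count_vector` are my own assumptions for illustration; scikit-learn's real tokenizer is more involved (for example, by default it ignores single-character tokens).

```python
from collections import Counter

# Toy corpus (made up for this illustration).
docs = ["the cat sat", "the cat sat on the mat"]

# Simplified tokenization: lowercase and split on whitespace.
tokenized = [d.lower().split() for d in docs]

# Vocabulary: every unique token, sorted so each word gets a fixed column.
vocab = sorted({tok for doc in tokenized for tok in doc})


def count_vector(tokens, vocab):
    """Return how many times each vocabulary word occurs in `tokens`."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]


# One row per document, one column per vocabulary word.
count_matrix = [count_vector(doc, vocab) for doc in tokenized]

for doc, row in zip(docs, count_matrix):
    print(f"{doc!r:28} -> {row}")
```

Each document becomes a row of numbers with one column per vocabulary word, and word order is thrown away. That is all "bag of words" means.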
In this video, I’ll try to explain the impact of changing Term Frequency and Inverse Document Frequency on the overall vector generated.
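If you want to play with the numbers yourself, the sketch below computes TF-IDF by hand in plain Python. It assumes the smoothed formula idf = ln((1 + n) / (1 + df)) + 1 followed by L2 normalization of each row, which to my knowledge matches TfidfVectorizer's default settings; the corpus, the whitespace tokenizer, and the function name `tfidf` are hypothetical and only for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocab, idf, rows) for a list of strings.

    Assumes smoothed IDF, ln((1 + n) / (1 + df)) + 1, and L2-normalized rows.
    """
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n = len(tokenized)

    # Document frequency: in how many documents does each word appear?
    df = {word: sum(word in doc for doc in tokenized) for word in vocab}
    idf = {word: math.log((1 + n) / (1 + df[word])) + 1 for word in vocab}

    rows = []
    for doc in tokenized:
        tf = Counter(doc)  # raw term frequency within this document
        raw = [tf[word] * idf[word] for word in vocab]
        norm = math.sqrt(sum(x * x for x in raw)) or 1.0
        rows.append([x / norm for x in raw])
    return vocab, idf, rows


# Toy corpus: "the" is in every document, "sat" in two, the animals in one each.
docs = ["the cat sat", "the dog sat", "the the bird"]
vocab, idf, rows = tfidf(docs)

for word in vocab:
    print(f"idf[{word!r}] = {idf[word]:.3f}")
for doc, row in zip(docs, rows):
    print(doc, [round(x, 3) for x in row])
```

In the first document every word occurs once, so the rarer "cat" ends up with a larger weight than "sat", which in turn beats "the": that is the IDF effect. In the third document "the" occurs twice, and the higher term frequency is enough to push it above the rarer "bird".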
I hope you all like it 🙂