In 1957, the English linguist John Firth said, ‘You shall know a word by the company it keeps.’
In a previous lecture, you briefly studied the concept of distributional semantics: words which appear in the same contexts have similar meanings. This simple idea has probably been the most powerful and useful insight in creating semantic processing systems. You will now study this idea in detail and learn to use it for various semantic processing applications.
To summarise, the basic idea we want to use to quantify the similarity between words is that words which occur in similar contexts are similar to each other. To do that, we need to represent words in a format which encapsulates their similarity with other words. For example, even though ‘greebel’ is a made-up term, if it occurs in the same contexts as ‘train’, then in such a representation ‘greebel’ and ‘train’ will be similar to each other.
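To make this concrete, here is a minimal Python sketch of the intuition. The toy corpus, the window size of two words, and the Jaccard overlap measure are all illustrative assumptions, not part of the lecture itself:

```python
# Toy corpus (invented for illustration): 'greebel' is used in the
# same kinds of sentences as 'train'.
corpus = [
    "the train arrived at the station on time",
    "she boarded the train to the city",
    "the greebel arrived at the station late",
    "he boarded the greebel to the city",
]

def context_words(target, sentences, window=2):
    """Collect words within `window` positions of `target` across sentences."""
    contexts = set()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token == target:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                contexts.update(tokens[lo:i] + tokens[i + 1:hi])
    return contexts

train_ctx = context_words("train", corpus)
greebel_ctx = context_words("greebel", corpus)

# Jaccard overlap of the two context sets as a crude similarity score
overlap = len(train_ctx & greebel_ctx) / len(train_ctx | greebel_ctx)
print(f"shared contexts: {sorted(train_ctx & greebel_ctx)}")
print(f"similarity: {overlap:.2f}")
```

On this toy corpus the two context sets coincide, so the overlap score is 1.0: the company a word keeps is enough to flag ‘greebel’ as train-like.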
The most commonly used representation of words is ‘word vectors’. There are two broad techniques to represent words as vectors:
- The term-document occurrence matrix, where each row is a term in the vocabulary and each column is a document (such as a webpage, tweet, book, etc.)
- The term-term co-occurrence matrix, where the entry in the ith row and jth column records how often the ith word occurs in the context of the jth word (both matrices are sketched in code below).
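The following is a minimal Python sketch of both matrices. The toy documents are invented, and the context window of two words and the cosine similarity measure are illustrative choices:

```python
import numpy as np

# Toy documents (invented for illustration)
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]
vocab = sorted({w for d in docs for w in d.split()})
idx = {w: i for i, w in enumerate(vocab)}

# Term-document matrix: rows are terms, columns are documents;
# entry (i, j) counts how often term i occurs in document j.
td = np.zeros((len(vocab), len(docs)), dtype=int)
for j, doc in enumerate(docs):
    for w in doc.split():
        td[idx[w], j] += 1

# Term-term co-occurrence matrix: entry (i, j) counts how often
# term i occurs within a +/-2 word window of term j.
tt = np.zeros((len(vocab), len(vocab)), dtype=int)
window = 2
for doc in docs:
    tokens = doc.split()
    for i, w in enumerate(tokens):
        for c in tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]:
            tt[idx[w], idx[c]] += 1

# Each row of either matrix is a word vector; cosine similarity
# between rows quantifies how similar two words are.
def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

sim = cosine(tt[idx["cat"]], tt[idx["dog"]])
print(f"cosine(cat, dog) = {sim:.2f}")
```

Here ‘cat’ and ‘dog’ end up with identical co-occurrence rows, so their cosine similarity is 1.0. In practice both matrices are very large and sparse, since the vocabulary can run into hundreds of thousands of terms.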