In this session, you learnt the two most commonly used paradigms of parsing – constituency parsing and dependency parsing.
In constituency parsing, you learnt the basic idea of constituents as grammatically meaningful groups of words, or phrases, such as noun phrases and verb phrases. You also learnt about context-free grammars (CFGs), which specify a set of production rules. Then, you learnt two broad approaches to constituency parsing:
- Top-down parsing
- Bottom-up parsing
You also learnt how left-recursion in grammar rules causes the top-down parser to run into an infinite loop. The alternative, in this case, is bottom-up parsing. You learnt that the shift-reduce algorithm can be used for bottom-up parsing.
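As a reminder of how shift-reduce parsing proceeds, here is a minimal sketch in plain Python. The toy grammar, the sentence and the function name `shift_reduce` are assumptions made for illustration. The parser is greedy: it reduces whenever the top of the stack matches the right-hand side of a rule and shifts otherwise. That is enough for this small grammar, but a practical parser would also need backtracking or lookahead.

```python
# Toy grammar (an assumption for this sketch): each rule is (LHS, RHS tuple).
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    ("Det", ("the",)),
    ("N", ("dog",)),
    ("N", ("cat",)),
    ("V", ("chased",)),
]


def shift_reduce(tokens, grammar=GRAMMAR, start="S"):
    """Greedy shift-reduce recogniser: returns (accepted, trace of actions)."""
    stack, buffer, trace = [], list(tokens), []
    while True:
        # Try to reduce: does the top of the stack match some rule's RHS?
        for lhs, rhs in grammar:
            n = len(rhs)
            if len(stack) >= n and tuple(stack[-n:]) == rhs:
                stack[-n:] = [lhs]
                trace.append(f"reduce {lhs} -> {' '.join(rhs)}")
                break
        else:
            # No reduction applies: shift the next word, or stop if none is left.
            if not buffer:
                break
            stack.append(buffer.pop(0))
            trace.append(f"shift {stack[-1]}")
    return stack == [start], trace


accepted, trace = shift_reduce("the dog chased the cat".split())
for step in trace:
    print(step)
print("accepted:", accepted)
```

The parse works upwards from the words to the start symbol `S`, which is why left-recursive rules do not send it into the infinite loop that a top-down parser can fall into.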
You also learnt that parsers based solely on CFGs often generate multiple parses of a sentence, though only some of them are likely to occur in the real world. To deal with such ambiguous sentences, you learnt to exploit the idea of probabilities using probabilistic context-free grammars (PCFGs). The probability associated with each rule helps the algorithm decide the most probable parse tree of an ambiguous sentence.
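The way a PCFG ranks competing parses can be sketched as follows. The probability of a parse tree is the product of the probabilities of the rules used to build it. The rule probabilities below are made-up numbers, and the two trees stand for the two attachments of the prepositional phrase in a sentence such as "saw the man with the telescope". Trees are written down to the part-of-speech level only, on the assumption that the word-level rules are the same in both parses and so do not affect the comparison.

```python
from math import prod

# Hypothetical rule probabilities; for each left-hand side they sum to 1.
RULE_PROB = {
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("VP", "PP")): 0.3,
    ("NP", ("Det", "N")): 0.8,
    ("NP", ("NP", "PP")): 0.2,
    ("PP", ("P", "NP")): 1.0,
}


def tree_prob(tree):
    """Probability of a tree given as (label, child, ...); leaves are POS-tag strings."""
    if isinstance(tree, str):
        return 1.0  # leaf: lexical rules are ignored in this sketch
    label, *children = tree
    child_labels = tuple(c if isinstance(c, str) else c[0] for c in children)
    return RULE_PROB[(label, child_labels)] * prod(tree_prob(c) for c in children)


NP = ("NP", "Det", "N")
PP = ("PP", "P", NP)

# Parse A: the PP attaches to the verb phrase.
verb_attach = ("VP", ("VP", "V", NP), PP)
# Parse B: the PP attaches to the noun phrase.
noun_attach = ("VP", "V", ("NP", NP, PP))

best = max([verb_attach, noun_attach], key=tree_prob)
print(tree_prob(verb_attach), tree_prob(noun_attach))
print("verb attachment preferred:", best is verb_attach)
```

With these particular numbers the verb-attachment parse scores higher, but a different set of rule probabilities could reverse the preference.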
Then you studied how to convert any CFG to the Chomsky Normal Form (CNF). Because CNF reduces the theoretically wide range of possible grammars to a standardised form, it makes parsing algorithms easier to write.
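One step of the conversion to CNF, breaking up rules whose right-hand side has more than two symbols, can be sketched as below. This is only a partial illustration: a full conversion also has to handle unit productions, empty productions and terminals mixed with non-terminals. The function name `binarize` and the new symbol names `X1`, `X2`, ... are assumptions of this sketch.

```python
def binarize(rules):
    """Split every rule with more than two right-hand-side symbols into binary rules.

    `rules` is a list of (LHS, RHS tuple). New non-terminals are named X1, X2, ...
    (assumed not to clash with symbols already in the grammar).
    """
    out = []
    counter = 0
    for lhs, rhs in rules:
        rhs = list(rhs)
        while len(rhs) > 2:
            counter += 1
            new_symbol = f"X{counter}"
            # Keep the first symbol and push the remainder into a new non-terminal.
            out.append((lhs, (rhs[0], new_symbol)))
            lhs, rhs = new_symbol, rhs[1:]
        out.append((lhs, tuple(rhs)))
    return out


for lhs, rhs in binarize([("VP", ("V", "NP", "PP")), ("NP", ("Det", "N"))]):
    print(lhs, "->", " ".join(rhs))
```

Here `VP -> V NP PP` becomes `VP -> V X1` and `X1 -> NP PP`, while the already-binary rule `NP -> Det N` is left unchanged.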
Finally, you learnt dependency parsing, which is based on an alternative paradigm of grammar called dependency grammar. You learnt that dependency parsing is useful for free-word-order languages, whereas constituency parsing techniques tend to perform well only on fixed-word-order languages. You also learnt the basic elements of a dependency parse, such as subject, verb and object.
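A dependency parse can be thought of as a set of labelled head-to-dependent arcs rather than a tree of phrases. The sketch below stores the parse of "the dog chased a cat" as (head, relation, dependent) triples. The relation names and the helper functions are assumptions for illustration, and identifying each word by its string only works because no word repeats in this sentence.

```python
# Hand-written arcs for "the dog chased a cat" (illustrative labels).
ARCS = [
    ("chased", "subject", "dog"),
    ("chased", "object", "cat"),
    ("dog", "determiner", "the"),
    ("cat", "determiner", "a"),
]


def find_root(arcs):
    """The root is the only word that is a head but never a dependent."""
    heads = {head for head, _, _ in arcs}
    dependents = {dep for _, _, dep in arcs}
    (root,) = heads - dependents
    return root


def dependents_of(head, arcs):
    """Map each relation label to the word that depends on `head` through it."""
    return {rel: dep for h, rel, dep in arcs if h == head}


root = find_root(ARCS)
print(root, dependents_of(root, ARCS))
```

Because the arcs record who depends on whom rather than where the words sit, the same set of arcs can describe a sentence whose words appear in a different order, which is what makes this representation a natural fit for free-word-order languages.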
In the next session, you will apply these syntactic analysis techniques to build a natural-language flight-booking system. In the process, you will learn NLP techniques formally known as Information Extraction and Named Entity Recognition, as well as sophisticated sequence modelling techniques such as Conditional Random Fields.