A key task in syntactic processing is parsing, which means breaking a given sentence down into its ‘grammatical constituents’. Parsing is an important step in many applications because it helps us better understand the linguistic structure of sentences.
Let’s understand parsing through an example. Let’s say you ask a question answering (QA) system, such as Amazon’s Alexa or Apple’s Siri, the following question: “Who won the cricket world cup in 2015?”
The QA system can respond meaningfully only if it can understand that the phrase ‘cricket world cup’ is related to the phrase ‘in 2015’. The phrase ‘in 2015’ refers to a specific time frame, and thus modifies the question significantly. Finding such dependencies or relations between the phrases of a sentence can be achieved using parsing techniques.
Let’s take another example sentence to understand what a parsed sentence looks like: “The quick brown fox jumps over the table”. The figure given below shows the three main constituents of this sentence. Note that actual parse trees are different from the simplified representation below.
This structure divides the sentence into three main constituents:
- ‘The quick brown fox’ is a noun phrase.
- ‘jumps’ is a verb phrase.
- ‘over the table’ is a prepositional phrase.
You will study elements of grammar and parsing techniques in the segments that follow. Prof. Srinath will introduce you to the following levels of syntactic analysis:
- Part-of-speech tagging.
- Constituency parsing.
- Dependency parsing.
Let’s understand the levels of syntax analysis using an example sentence: “The little boy went to the park.”
POS tagging is the task of assigning a part-of-speech tag (POS tag) to each word. A POS tag identifies the linguistic role of the word in the sentence. The POS tags of the example sentence are:
The | little | boy | went | to | the | park |
Determiner | Adjective | Noun | Verb | Preposition | Determiner | Noun |
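As a minimal sketch of the idea, the tagging above can be reproduced with a hand-written lookup table. The `TOY_LEXICON` dictionary and the `tag_sentence` helper are hypothetical names made up for this illustration. Real POS taggers typically learn tags from annotated text and use context to resolve ambiguous words, which a plain lookup cannot do.

```python
# Toy lexicon: an assumption for this illustration that covers only the
# words of the example sentence.
TOY_LEXICON = {
    "the": "Determiner",
    "little": "Adjective",
    "boy": "Noun",
    "went": "Verb",
    "to": "Preposition",
    "park": "Noun",
}


def tag_sentence(sentence):
    """Return a list of (word, tag) pairs using a plain dictionary lookup."""
    words = sentence.rstrip(".").split()
    # Words missing from the lexicon are tagged "Unknown" rather than guessed.
    return [(word, TOY_LEXICON.get(word.lower(), "Unknown")) for word in words]


tagged = tag_sentence("The little boy went to the park.")
for word, tag in tagged:
    print(f"{word:<8}{tag}")
```

A lookup like this assigns the same tag to a word every time it appears, so it would fail on words whose role depends on the sentence around them.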
Constituency parsers divide the sentence into constituent phrases such as noun phrases, verb phrases and prepositional phrases. Each constituent phrase can itself be divided into further phrases. The constituency parse tree given below divides the sentence into two main phrases: a noun phrase and a verb phrase. The verb phrase is further divided into a verb and a prepositional phrase, and so on.
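The nesting described above can be sketched as a hand-built tree for the example sentence. The tree is written by hand, not produced by a parser, and the labels (`S`, `NP`, `VP`, `PP`, `Det`, `Adj`, `N`, `V`, `P`) are illustrative shorthand chosen for this sketch, so they may not match the labels in the figure. The helpers `leaves` and `phrases` are hypothetical names.

```python
# Nested tuples of the form (label, child, child, ...); a leaf is a plain string.
# The tree and its labels are assumptions made for this illustration.
PARSE = (
    "S",
    ("NP", ("Det", "The"), ("Adj", "little"), ("N", "boy")),
    ("VP",
        ("V", "went"),
        ("PP", ("P", "to"), ("NP", ("Det", "the"), ("N", "park")))),
)


def leaves(node):
    """Collect the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


def phrases(node, wanted=("NP", "VP", "PP")):
    """Yield (label, text) for every phrase-level node, outermost first."""
    if isinstance(node, str):
        return
    if node[0] in wanted:
        yield node[0], " ".join(leaves(node))
    for child in node[1:]:
        yield from phrases(child, wanted)


constituents = list(phrases(PARSE))
for label, text in constituents:
    print(label, "->", text)
```

Walking the tree this way lists the noun phrase and the verb phrase first, then the prepositional phrase inside the verb phrase, and finally the noun phrase inside that prepositional phrase.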
Dependency parsers do not divide a sentence into constituent phrases, but rather establish relationships directly between the words themselves. The figure below is an example of a dependency parse tree of the sentence given above (generated using the spaCy dependency visualiser). In this module, you’ll understand when dependency parsing is more useful than constituency parsing and study the elements of dependency grammar.
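To make the word-to-word relationships concrete, here is a minimal sketch that stores a dependency parse of the example sentence as a list of arcs. The arcs are written by hand, not produced by spaCy, and the relation names (`det`, `amod`, `nsubj`, `prep`, `pobj`, `root`) are assumed labels in the style commonly seen in English dependency parses, so the figure may use different ones. `ARCS` and `dependents_of` are hypothetical names.

```python
# Each entry is (dependent, relation, head); the root word has no head.
# The arcs and relation names are assumptions made for this illustration.
# Words are used as identifiers here, which only works because "The" and
# "the" differ in case; a real parser would identify tokens by position.
ARCS = [
    ("The", "det", "boy"),
    ("little", "amod", "boy"),
    ("boy", "nsubj", "went"),
    ("went", "root", None),
    ("to", "prep", "went"),
    ("the", "det", "park"),
    ("park", "pobj", "to"),
]


def dependents_of(head):
    """Return (dependent, relation) pairs for words attached directly to head."""
    return [(dep, rel) for dep, rel, h in ARCS if h == head]


# The root is the only word without a head.
root = next(dep for dep, rel, head in ARCS if head is None)
print("root:", root)
for dep, rel in dependents_of(root):
    print(f"{root} --{rel}--> {dep}")
```

Note that no phrase labels appear anywhere in this structure: every relation links one word to another, which is the main difference from the constituency view.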
You will study these parsing techniques in the sections that follow. In the next few segments, you will study POS tagging in detail.