Until now, we have been discussing constituency parsing, where groups of words, or constituents, form the basic structure of a parse tree. In this section, we will introduce an alternative paradigm of grammar called dependency grammar, along with the related dependency parsing techniques.
In dependency grammar, constituents (such as NP, VP etc.) are not the basic elements of the grammar. Instead, dependencies are established between the words themselves. For example, consider the following dependency parse tree of the sentence “man saw dogs” created using the displaCy dependency visualiser:
The dependencies can be read as follows: ‘man’ is the subject of the sentence (the one who is doing something); ‘saw’ is the main verb (something that is being done); while ‘dogs’ is the object of ‘saw’ (to whom something is being done).
Notice that there is no notion of phrases or constituents; the relationships are established directly between the words.
The basic idea of dependency parsing is that each sentence is about something, and usually contains a subject (the doer), a verb (what is being done) and an object (to whom something is being done).
In general, Subject-Verb-Object (SVO) is the basic word order in present-day English (which is said to follow a ‘rigid word order’ form – more on that in the lecture). Of course, many sentences are too complex to fit this simplified SVO structure, though sophisticated dependency parsing techniques are able to handle most of them.
Fixed and Free-Word-Order Languages
Modern languages can be divided into two broad types – fixed-word order and free-word order.
To understand the nature of languages, consider the following English sentences:
The labourers (Subject) built (Verb) a strong wall (Object).
The professor (Subject) taught (Verb) NLP (Object) to the entire class.
In both examples, the words appear in SVO order. Languages such as English, in which the roles of the words are tied to a largely fixed order like this, are called fixed-word-order languages.
On the other hand, consider the following Hindi sentences:
The two Hindi sentences have the same meaning even though they are written in two different word orders (SOV and OSV). Fewer languages, Hindi and Spanish among them, allow such a free order of words.
Prof. Srinath will explain the concept of ‘free’ and ‘fixed’ word order languages, why CFGs cannot handle free word order languages, and the concept of ‘dependencies’.
You saw that free word order languages such as Hindi are difficult to parse using constituency parsing techniques. This is because, in such free-word-order languages, the order of words/constituents may change significantly while keeping the meaning exactly the same. It is thus difficult to fit the sentences into the finite set of production rules that CFGs offer.
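One way to get a feel for the problem is to count rules: if the subject, object and verb of a clause may appear in any order, a CFG needs a separate production for each ordering. The snippet below is only a rough sketch of that counting argument. The category names NP_subj, NP_obj and V are hypothetical labels chosen for illustration and are not taken from the lecture.

```python
from itertools import permutations

# Hypothetical constituent labels, used only for illustration.
constituents = ["NP_subj", "NP_obj", "V"]

# One production "S -> ..." for every possible ordering of the constituents.
rules = ["S -> " + " ".join(order) for order in permutations(constituents)]

for rule in rules:
    print(rule)

# Three constituents give 3! = 6 orderings, and the count grows
# factorially as more freely ordered constituents are added.
print(len(rules))
```

A dependency representation avoids this growth because the subject and object relations to the verb do not change when the words are reordered.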
You also saw how dependencies in a sentence are defined using the elements Subject-Verb-Object.
The following table shows examples of three types of sentences – declarative, interrogative, and imperative:
Sentence type | Example | Structure
Declarative | Shyam complimented Suraj | Subject Verb Object
Interrogative | Will the teacher take the class today? | Aux Subject Verb Object (Aux: auxiliary verbs such as will, be, can)
Imperative | Stop the car! | Verb Object
Universal Dependencies
Apart from the dependencies defined in terms of subject, verb and object, there are many other dependency relationships, and the list is not exhaustive. Let’s look at what a dependency parse structure looks like and how dependencies are established among the words using what are called universal dependencies.
Elements of Dependency Grammar
Let’s understand Dependency Parsing in a little more detail using some examples. Consider the declarative sentence: “The man jumped from the moving train into the river”.
In a dependency parse, we start from the root of the sentence, which is often a verb. In the example above, the root is the word ‘jumped’. The intuition for the root is that it is the main word that describes the ‘aboutness’ of a sentence. Although the sentence is also about ‘The man’, ‘the moving train’ and ‘the river’, it is most strongly about the fact that someone ‘jumped’ from somewhere into something.
As you saw in the lecture, dependencies are represented as labelled arcs of the form h → d (l) where ‘h’ is called the “head” of the dependency, ‘d’ is the “dependent” and l is the “label” assigned to the arc.
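The h → d (l) notation maps naturally onto a small record type. Here is a minimal sketch in plain Python. The class name Arc and its field names are our own choices for illustration, and the labels nsubj and dobj for “man saw dogs” are assumed to follow the label set used later in this section.

```python
from typing import NamedTuple

class Arc(NamedTuple):
    head: str       # h: the word that governs the relation
    dependent: str  # d: the word that depends on the head
    label: str      # l: the type of the relation

    def __str__(self) -> str:
        return f"{self.head} -> {self.dependent} ({self.label})"

# Arcs for the three-word sentence "man saw dogs" discussed earlier.
arcs = [
    Arc("saw", "man", "nsubj"),
    Arc("saw", "dogs", "dobj"),
]

for arc in arcs:
    print(arc)
```

Storing each arc as a (head, dependent, label) record keeps the parse independent of word order, since nothing in the record refers to a position in the sentence.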
There is a non-exhaustive list of dependency roles. The most commonly used labels are mentioned in the following downloadable document.
You can read more about labels from the following URL.
Please go through the document so that it’ll be easier for you to understand the dependencies mentioned later. Let’s now understand the dependency graph using a small sentence.
Sentence: “The fat man ate cake.”
The root, also called the head of the sentence, is the verb ‘ate’, since it describes what the sentence is about. All the other words depend on ‘ate’, either directly (‘man’ and ‘cake’) or indirectly through another word (‘The’ and ‘fat’, which depend on ‘man’).
The dependencies are as follows:
- The word ‘man’ is the nominal subject (nsubj) of the verb ‘ate’.
- The word ‘fat’ modifies the word ‘man’ and is its adjective modifier (amod).
- The word ‘The’ is a determiner (det) associated with the word ‘man’.
- The word ‘cake’ is the direct object (dobj) of the verb ‘ate’.
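The arcs listed above can be written down as (head, dependent, label) triples and printed as an indented tree starting from the root. This is a hand-built sketch, not the output of a parser. The helper name show and the ROOT placeholder label are our own.

```python
from collections import defaultdict

# (head, dependent, label) triples for "The fat man ate cake."
arcs = [
    ("ate", "man", "nsubj"),
    ("man", "fat", "amod"),
    ("man", "The", "det"),
    ("ate", "cake", "dobj"),
]

# Group the dependents of each head so the tree can be walked top-down.
children = defaultdict(list)
for head, dependent, label in arcs:
    children[head].append((dependent, label))

def show(word, label="ROOT", depth=0):
    """Return the subtree under `word` as a list of indented lines."""
    lines = ["  " * depth + f"{word} ({label})"]
    for dependent, dep_label in children.get(word, []):
        lines.extend(show(dependent, dep_label, depth + 1))
    return lines

tree_lines = show("ate")
print("\n".join(tree_lines))
```

The indentation makes it visible that ‘fat’ and ‘The’ hang off ‘man’ rather than directly off the root ‘ate’.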
Let’s now understand the dependency parse tree for the sentence shown in the lecture:
Sentence: “Economic news had little effect on financial markets”.
You can visualise the dependency parse of this sentence here. In the parse shown below, we have merged phrases such as ‘Economic news’ and ‘little effect’.
Let’s identify the role of each word one by one, starting with the root verb.
- The word ‘had’ is the root.
- The phrase ‘Economic news’ is the nominal subject (nsubj).
- The phrase ‘little effect’ is the direct object (dobj) of the verb ‘had’.
- The word ‘on’ is a preposition associated with ‘little effect’.
- The noun phrase ‘financial markets’ is the object of ‘on’.
Let’s now look at the role of each word in the parse.
- The word ‘Economic’ is the modifier of ‘news’. This is represented as:
news -> amod -> economic
- The words ‘little’ and ‘financial’ modify the words ‘effect’ and ‘markets’ respectively:
effect -> amod -> little
markets -> amod -> financial
- That leaves the two words ‘on’ and ‘markets’. The word ‘on’ is dependent on the word ‘effect’ as a prepositional modifier (prep):
effect -> prep -> on
- The word ‘markets’ is an object of the word ‘on’:
on -> pobj -> markets
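Putting the word-level arcs together, every word except the root has exactly one head, so the whole parse can be stored as a word-to-head table. The sketch below is built by hand from the arcs listed above. Keying the table by the word itself is a simplification that only works because no word repeats in this sentence, and the helper name path_to_root is our own.

```python
# Each word maps to (head, label); the root has no head.
heads = {
    "Economic": ("news", "amod"),
    "news": ("had", "nsubj"),
    "had": (None, "ROOT"),
    "little": ("effect", "amod"),
    "effect": ("had", "dobj"),
    "on": ("effect", "prep"),
    "financial": ("markets", "amod"),
    "markets": ("on", "pobj"),
}

# A well-formed parse has exactly one word without a head.
roots = [word for word, (head, _) in heads.items() if head is None]

def path_to_root(word):
    """Follow head links from `word` up to the root."""
    path = [word]
    while heads[path[-1]][0] is not None:
        path.append(heads[path[-1]][0])
    return path

print(roots)
print(path_to_root("financial"))
```

Following the head links from ‘financial’ passes through ‘markets’, ‘on’ and ‘effect’ before reaching ‘had’, which matches the chain of arcs described above.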
Dependency Parsing – Optional Content
Dependency parsing is a fairly advanced topic, and studying it requires a much deeper understanding of English grammar and parsing algorithms than this course can offer.
The topics covered in optional content should provide you with sufficient background to study further advanced topics, such as the one provided below.
Additional Reading
- You can read more about dependency parsing in Chapter 13, “Dependency Parsing”, of Speech and Language Processing by Daniel Jurafsky and James H. Martin.