
Introduction

Welcome to the module on Syntactic Processing. The previous module, on Lexical Processing, focussed on text preprocessing and feature extraction, in which you learnt the following techniques:

  • Regular expressions.
  • Tokenisation, Stemming, Lemmatisation.
  • The TF-IDF model.
  • Phonetic hashing.
  • The minimum edit distance algorithm.

You also learnt to build a spam detector and a spell corrector in that module.

In this module, you will learn algorithms and techniques used to analyse the syntax, or grammatical structure, of sentences. In the first session, you will learn the basics of grammar (part-of-speech tags, etc.) and build your own POS taggers using models such as Hidden Markov Models (HMMs). In the second session, you will study techniques for parsing the grammatical structure of sentences, such as Context-Free Grammars (CFGs), Probabilistic CFGs, and dependency parsing. Finally, in the third session, you will learn to build an Information Extraction (IE) system that parses users' flight booking queries using techniques such as Named Entity Recognition (NER). You will also study a class of models called Conditional Random Fields (CRFs), which are widely used for building NER systems.

All these techniques fall under what is called syntactic processing.  

Syntactic processing is widely used in applications such as question answering systems, information extraction, sentiment analysis, and grammar checking.

In this session

This session will introduce you to the following topics:

  • The What and Why of Syntactic Processing.
  • Basics of Grammar and Parsing.
  • Algorithms for Part-of-Speech Tagging and Hidden Markov Models.

People you will hear from in this session:

Subject Matter Expert

Srinath Srinivasa

Professor and Dean, R&D, IIIT-B

The International Institute of Information Technology, Bangalore, also known as IIIT-B, is one of India’s foremost graduate schools. Through its Integrated M.Tech., M.Tech., M.S. (Research) and PhD programs in the IT space, it focuses equally on innovation and education.
