
What’s Next in Semantic Processing?

The previous two sessions (Introduction to Semantic Processing and Distributional Semantics) comprise the main module on Semantic Processing. The next two sessions, Topic Modelling and Social Media Opinion Mining – Semantic Processing Case Study, are optional.

The concepts covered in the two optional sessions are outlined briefly here. You can study these sessions at your own pace.

Session-3: Topic Modelling

Topic modelling is the task of identifying the key ‘topics’ discussed in a text.

For example, suppose you are a product manager at Amazon and want to understand which features (‘topics’) of a recently released product (say, Amazon Alexa) customers are talking about in their reviews. Similarly, suppose you have a large set of documents (e.g. research papers, news articles, blog posts) and want to identify the ‘topics’ contained in each document, such as ‘diabetes’, ‘movies’ or ‘astronomy’.

Topic modelling can be applied to a wide variety of text documents such as tweets, books, scientific articles etc. In this session, you will study various models used for topic modelling in detail and learn to build topic models in Python. Specifically, you will learn:

  • Introduction to topic models
  • Probabilistic Latent Semantic Analysis (PLSA)
  • Latent Dirichlet Allocation (LDA)
  • Building Topic Models in Python: Amazon Product Reviews, Demonetisation Tweets
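
To give a flavour of what a topic model produces, here is a minimal sketch of LDA fitted with collapsed Gibbs sampling, written in plain Python. It is not the implementation used in the session, which may rely on a library instead. The function name lda_gibbs, the hyperparameter values and the four toy ‘reviews’ are all assumptions made for illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not optimised).

    docs is a list of token lists. Returns (theta, top_words), where theta[d]
    is the estimated topic mixture of document d and top_words[k] lists the
    most frequent words assigned to topic k.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # total tokens in topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Weight of each topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    top_words = [
        [w for w, c in tw.most_common(3) if c > 0] for tw in topic_word
    ]
    return theta, top_words


# Hypothetical, hand-made 'reviews' (already tokenised) for demonstration only.
docs = [
    "sound speaker bass sound music".split(),
    "music speaker sound volume".split(),
    "voice command alexa voice assistant".split(),
    "alexa assistant command skills".split(),
]
theta, top_words = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words):
    print("topic", k, words)
```

Each row of theta is a probability distribution over topics for one document, and top_words gives a rough label for each topic. On real data you would also remove stop words and tune the number of topics.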

Session-4: Social Media Opinion Mining – Semantic Processing Case Study

In this session, you will build a hands-on application that analyses ‘social-media narratives’ using tweets. Social media analysis is one of the fastest-growing areas in semantic processing (and NLP in general): companies want to understand users’ opinions of their products, political parties want to understand public sentiment, and so on.

In this session, you will learn to build an application which analyses the various public narratives (or opinions) on a given controversial topic (such as demonetisation). Specifically, the techniques you will use to build the application are:

  • Modelling ‘social cognition’ through narratives and opinions on Twitter
  • Training word vectors using tweets
  • Creating ‘tweet vectors with a sentiment score’ using word vectors
  • Clustering the documents (tweets)  
  • Identifying key ‘entities’ in tweets: Applying POS tagging and chunking to tweets
  • Interpreting results, i.e. public narratives on a topic (e.g. demonetisation)
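
As an illustration of the ‘tweet vectors with a sentiment score’ step listed above, the sketch below averages the word vectors of a tweet and appends one extra dimension holding a sentiment score. This is one plausible construction, not necessarily the one used in the case study. The function tweet_vector, the two-dimensional word vectors and the sentiment lexicon are hypothetical stand-ins; in practice the vectors would be trained on the tweets themselves.

```python
def tweet_vector(tokens, word_vectors, sentiment_lexicon):
    """Average the vectors of known words and append a mean sentiment score.

    Assumption: unknown words are skipped, and a tweet with no known words
    gets a zero vector and a neutral sentiment of 0.0.
    """
    dim = len(next(iter(word_vectors.values())))
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if known:
        mean = [sum(v[i] for v in known) / len(known) for i in range(dim)]
    else:
        mean = [0.0] * dim

    scores = [sentiment_lexicon[t] for t in tokens if t in sentiment_lexicon]
    sentiment = sum(scores) / len(scores) if scores else 0.0
    return mean + [sentiment]


# Hypothetical toy inputs, chosen only to make the arithmetic easy to follow.
word_vectors = {
    "cash": [1.0, 0.0],
    "queue": [0.0, 1.0],
    "good": [0.5, 0.5],
}
sentiment_lexicon = {"good": 1.0, "bad": -1.0}

vec = tweet_vector(["good", "cash", "move"], word_vectors, sentiment_lexicon)
print(vec)  # [0.75, 0.25, 1.0]
```

Vectors built this way all have the same length, so they can be passed directly to a clustering algorithm in the next step.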

This project uses multiple concepts you’ve been taught throughout the Natural Language Processing (NLP) course, such as POS tagging, chunking, word vectors and sentiment analysis. You will also see Zipf’s law (a power law) in action.
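
Zipf’s law says that a word’s frequency is roughly inversely proportional to its frequency rank, so rank × frequency stays roughly constant. The sketch below shows how you might check this on a token list. The helper rank_frequency is a hypothetical name, and the toy corpus is constructed by hand to follow the law exactly; real tweets will only approximate it.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, frequency) tuples, most frequent word first."""
    counts = Counter(tokens)
    return [
        (rank, word, freq)
        for rank, (word, freq) in enumerate(counts.most_common(), start=1)
    ]


# Artificial corpus built so that frequency = 12 / rank (an assumption for the demo).
tokens = ["the"] * 12 + ["of"] * 6 + ["and"] * 4 + ["to"] * 3

table = rank_frequency(tokens)
for rank, word, freq in table:
    print(rank, word, freq, rank * freq)
```

In this constructed example the last column is 12 on every row. On a real corpus, plotting log(frequency) against log(rank) should give an approximately straight line.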