Summary

In this session, you learnt about the different industries where text analytics is applied, such as healthcare, e-commerce, retail and finance. Then you learnt about the stack that is generally followed to extract insights from text and to build various applications of natural language processing. You learnt that there are three stages in text analytics:

  • Lexical processing
  • Syntactic processing
  • Semantic processing

Then you learnt about text encoding and its common standards, such as ASCII and Unicode. You learnt how to convert between different Unicode encodings in Python.
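As a quick refresher, converting between encodings in Python can be sketched as follows. This is a minimal illustration rather than the exact code from the session: the sample string and variable names are assumptions chosen for the example, and only the built-in str.encode and bytes.decode methods are used.

```python
# Sample text containing two non-ASCII characters (an assumption for illustration).
text = "naïve café"

# Encode the same string with two different Unicode encodings.
utf8_bytes = text.encode("utf-8")       # ï and é take two bytes each
utf16_bytes = text.encode("utf-16-le")  # every character here takes two bytes

# To change encoding, decode the bytes back to str, then encode again.
converted = utf8_bytes.decode("utf-8").encode("utf-16-le")

# ASCII cannot represent ï or é, so encoding to ASCII raises an error.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(len(utf8_bytes), len(utf16_bytes), ascii_ok)
```

The byte lengths differ because UTF-8 uses a variable number of bytes per character, whereas UTF-16 uses two bytes for each of the characters in this string.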

Then you learnt about regular expressions. You learnt how to manipulate and extract the information that you want from a given text corpus using regular expressions. You learnt about quantifiers, their different types and how they are used to specify the number of times a character or group of characters must occur. You learnt about the anchor characters (^ and $) and the wildcard (.). Then you learnt about character sets and meta-sequences, which are shorthand for common character sets. You then learnt about the two types of searches, greedy and non-greedy, and how they differ. You also learnt the use of grouping characters in a regular expression. Finally, you looked at the different functions that Python provides to facilitate the use of regular expressions in practical settings.
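The concepts above can be revisited with a short sketch using Python's built-in re module. The sample strings and variable names are assumptions made up for this illustration; the functions used (re.findall, re.search and re.sub) are standard.

```python
import re

# Hypothetical sample text for illustration.
text = "Order #123 shipped on 2021-03-15; order #4567 shipped on 2021-04-02."

# Quantifier and meta-sequence: \d+ matches one or more digits.
order_ids = re.findall(r"#(\d+)", text)

# Anchors: ^ matches at the start of the string, $ at the end.
starts_with_order = bool(re.search(r"^Order", text))
ends_with_period = bool(re.search(r"\.$", text))

# Greedy vs non-greedy: .* takes as much as possible, .*? as little as possible.
html = "<b>bold</b> and <i>italic</i>"
greedy = re.search(r"<.*>", html).group()
non_greedy = re.search(r"<.*?>", html).group()

# Grouping: parentheses capture parts of the first date found.
year, month, day = re.search(r"(\d{4})-(\d{2})-(\d{2})", text).groups()

# Substitution: re.sub replaces every match of the pattern.
masked = re.sub(r"\d{4}-\d{2}-\d{2}", "<DATE>", text)

print(order_ids, greedy, non_greedy, (year, month, day), masked, sep="\n")
```

Here the greedy pattern matches the whole HTML string, while the non-greedy pattern stops at the first closing angle bracket and returns only the opening tag.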

Finally, you can refer to this link whenever you want a refresher on regular expressions in Python. We have left some concepts in regular expressions untouched, but as someone working in the area of text analytics, you can achieve most of what you need using the tools that you have learnt.

In the next section, you’ll attempt the graded questions to test your learning.
