
Commonly Used RE Functions

Till now, you’ve seen only one function of the ‘re’ module, that is, the ‘re.search()’ function. While it is very commonly used when working with regular expressions in Python, it is not the only function that you’ll need.

You’re going to learn about four more functions in this section, which Krishna explains one by one. Let’s look at the other functions.

You learnt about the match function and the search function. The match function will only match if the pattern is present at the very start of the string. The search function, on the other hand, scans the string from left to right and returns the first match it finds, wherever in the string that match occurs.
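The difference between the two can be seen in a small sketch. The sample string and variable names below are made up for illustration and do not come from the lesson.

```python
import re

# Illustrative sample text (an assumption, not from the lesson)
text = "The year 2023 was warm"

# re.match() succeeds only if the pattern matches at the very start of the string
match_result = re.match(r"\d+", text)

# re.search() scans from left to right and returns the first match found anywhere
search_result = re.search(r"\d+", text)

print(match_result)           # None, because the string starts with "The"
print(search_result.group())  # 2023
```

Both functions return a match object on success and None on failure, so it is usually worth checking the result before calling group() on it.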

The next function that you’re going to study is the re.sub() function. It is used to substitute a substring with another substring of your choice. 

Regular expression patterns can help you find the substring in a given corpus of text that you want to substitute with another string. For example, you might want to replace the American spelling ‘color’ with the British spelling ‘colour’. The re.sub() function is also very useful in text cleaning. It can be used to replace every special character in a given string with a flag string, say, SPCL_CHR, that stands in for all the special characters in the text.
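Both use cases can be sketched as follows. The sample sentences are made up, and the definition of a "special character" used here (anything that is not a letter, a digit or whitespace) is an assumption chosen for illustration; a real cleaning task may need a different character class.

```python
import re

# Spelling replacement: \b keeps the pattern from matching inside longer words
sentence = "The color of the wall matches the color of the door."
british = re.sub(r"\bcolor\b", "colour", sentence)
print(british)  # The colour of the wall matches the colour of the door.

# Text cleaning: flag every special character with a placeholder string.
# Assumption: "special" means not a letter, digit or whitespace character.
raw_text = "Hello! Is this #regex @work?"
cleaned = re.sub(r"[^A-Za-z0-9\s]", "SPCL_CHR", raw_text)
print(cleaned)  # HelloSPCL_CHR Is this SPCL_CHRregex SPCL_CHRworkSPCL_CHR
```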

Next, Krishna will explain how to use the re.sub() function.

The re.sub() function is used to substitute a part of your string using a regex pattern. You will often want to replace a substring that follows a particular pattern: the regex engine finds the substrings that match the pattern, and re.sub() replaces them with the string you supply.

Note that this command replaces all the occurrences of the pattern inside the string. For example, take a look at the following command:

import re

pattern = r"\d"  # raw string, so that \d reaches the regex engine unchanged
replacement = "X"
string = "My address is 13B, Baker Street"

re.sub(pattern, replacement, string)

It will return the string: “My address is XXB, Baker Street”. The original string is left unchanged, because re.sub() returns a new string.

The next set of functions lets you search the entire input string and return all the matches, in case more than one is present.

To summarise, the match and search functions return only one match. But you often need to extract all the matches rather than only the first one, and that’s when you use the other functions.

Suppose you want to extract all the dates from a huge corpus of text. In that case, you can use the findall() function or the finditer() function. The findall() function returns a list of all the matches, whereas the finditer() function returns an iterator that is typically used in a ‘for’ loop to go through the matches one by one.
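A minimal sketch of both functions on the date example is shown below. The sample text and the DD-MM-YYYY date format are assumptions made for illustration; real corpora usually contain several date formats and need a more flexible pattern.

```python
import re

# Illustrative corpus; dates are assumed to be written as DD-MM-YYYY
corpus = "Invoices were issued on 12-03-2021 and 05-11-2022; payment is due by 01-01-2023."
date_pattern = r"\d{2}-\d{2}-\d{4}"

# findall() returns a list of all the matching substrings
all_dates = re.findall(date_pattern, corpus)
print(all_dates)  # ['12-03-2021', '05-11-2022', '01-01-2023']

# finditer() yields one match object per match, which also exposes positions
found = []
for match in re.finditer(date_pattern, corpus):
    found.append((match.group(), match.start(), match.end()))
    print(match.group(), "found at index", match.start())
```

findall() is convenient when only the matched text is needed, while finditer() is the better fit when you also need each match's position, or when the corpus is large and you would rather not build the full list in memory.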

The following questions will help you practice these functions.

In the next section, you’ll learn about grouping a regular expression into different parts.
