- A regular expression is a declarative way to describe a group of strings that follow a particular format or pattern. Whenever we need to represent such a group of strings, we should use regular expressions.
- The main application areas of regular expressions are:
- To develop validation frameworks/validation logic.
- To develop pattern matching applications.
- To develop translators such as compilers and interpreters.
- To develop communication protocols such as TCP/IP and UDP.
Character Classes
- We can use character classes to match any one character from a group of characters.
- [abc] ⇒ Either a or b or c.
- [^abc] ⇒ Any character except a, b, and c.
- [a-z] ⇒ Any lowercase letter.
- [A-Z] ⇒ Any uppercase letter.
- [a-zA-Z] ⇒ Any letter.
- [0-9] ⇒ Any digit from 0 to 9.
- [a-zA-Z0-9] ⇒ Any alphanumeric character.
- [^a-zA-Z0-9] ⇒ Any character except alphanumeric characters (special characters).
Note
- finditer() is a predefined function in the re module. It returns an iterator that yields a match object for every match.
Question
Select all letters from the given string.
Python
import re
# [a-zA-Z] matches any single letter; "[abc]" would match only a, b, or c.
matcher = re.finditer("[a-zA-Z]", "a7b@k9z")
for match in matcher:
    print(match.start(), "......", match.group())
Output
0 ...... a
2 ...... b
4 ...... k
6 ...... z
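The remaining character classes can be tried on the same string. The following is a minimal sketch, assuming the same input "a7b@k9z"; the variable names are illustrative only.

```python
import re

text = "a7b@k9z"

# [0-9] matches each digit; [^a-zA-Z0-9] matches anything that is not alphanumeric.
digits = [(m.start(), m.group()) for m in re.finditer("[0-9]", text)]
specials = [(m.start(), m.group()) for m in re.finditer("[^a-zA-Z0-9]", text)]

print(digits)    # [(1, '7'), (5, '9')]
print(specials)  # [(3, '@')]
```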
Pre Defined Character Classes
- \s ⇒ Any whitespace character (space, tab, newline etc.).
- \S ⇒ Any character except a whitespace character.
- \d ⇒ Any digit from 0 to 9.
- \D ⇒ Any character except a digit.
- \w ⇒ Any word character [a-zA-Z0-9_].
- \W ⇒ Any character except a word character (special characters).
- . ⇒ Any character, including special characters, except a newline by default.
Example
Python
import re
# A raw string (r"...") keeps the backslash intact for the regex engine.
matcher = re.finditer(r"\S", "a7b k@9z")
for match in matcher:
    print(match.start(), "......", match.group())
Output
0 ...... a
1 ...... 7
2 ...... b
4 ...... k
5 ...... @
6 ...... 9
7 ...... z
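The other predefined classes behave the same way. Here is a small sketch on the same input; it uses re.findall(), which is not covered above, as a shortcut to collect the matched strings into a list.

```python
import re

text = "a7b k@9z"

digits = re.findall(r"\d", text)        # digits only
word_chars = re.findall(r"\w", text)    # letters, digits, and underscore
non_word = re.findall(r"\W", text)      # everything that is not a word character
space_positions = [m.start() for m in re.finditer(r"\s", text)]

print(digits)           # ['7', '9']
print(word_chars)       # ['a', '7', 'b', 'k', '9', 'z']
print(non_word)         # [' ', '@']
print(space_positions)  # [3]
```

Note that the space counts as a non-word character, so \W reports it along with @.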
Quantifiers
- We can use quantifiers to specify the number of occurrences to match.
- The following quantifiers are available:
- a ⇒ Exactly one a.
- a+ ⇒ At least one a.
- a* ⇒ Any number of a's, including zero.
- a? ⇒ At most one a, i.e. either zero or one a.
- a{m} ⇒ Exactly m occurrences of a.
- a{m,n} ⇒ At least m and at most n occurrences of a.
Example
Python
import re
matcher=re.finditer("a+","abaabaaab")
for match in matcher:
    print(match.start(), "......", match.group())
Output
0 ...... a
2 ...... aa
5 ...... aaa
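The counted forms a{m} and a{m,n} can be checked on the same string. This is a rough sketch; the helper name find_all is hypothetical and only wraps finditer() for brevity.

```python
import re

text = "abaabaaab"

def find_all(pattern):
    """Return (start index, matched text) for every match of pattern in text."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, text)]

exactly_two = find_all("a{2}")     # runs of exactly two a's at a time
two_to_three = find_all("a{2,3}")  # runs of two or three a's, as many as possible

print(exactly_two)   # [(2, 'aa'), (5, 'aa')]
print(two_to_three)  # [(2, 'aa'), (5, 'aaa')]
```

With a{2}, the run "aaa" yields only one match because the leftover single a is too short to match again.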
Note
- ^x ⇒ Checks whether the target string starts with x.
- x$ ⇒ Checks whether the target string ends with x.
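As an illustration, the two anchors can be tested with re.search(), which returns a match object when the pattern is found and None otherwise. The sample string below is made up for this sketch.

```python
import re

text = "learning python is easy"  # hypothetical sample string

starts_with_learn = re.search("^learn", text) is not None
ends_with_easy = re.search("easy$", text) is not None
ends_with_python = re.search("python$", text) is not None

print(starts_with_learn)  # True
print(ends_with_easy)     # True
print(ends_with_python)   # False
```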