
Web Scraping

  • Web scraping is the process of collecting information from web pages. Regular expressions can then pick out the patterns we need from the downloaded text, such as mail IDs and mobile numbers.
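To make the idea concrete without touching the network, here is a minimal sketch that searches a hard-coded HTML string. The sample text, the names `SAMPLE_HTML`, `EMAIL_PATTERN`, `MOBILE_PATTERN` and `extract_contacts`, and both simplified patterns are assumptions made for illustration; they are not complete validators.

```python
import re

# Hypothetical page content, used here instead of a live download.
SAMPLE_HTML = """
<html><body>
<p>Contact: support@example.com or sales@example.org</p>
<p>Call 9876543210 or 9123456789</p>
</body></html>
"""

# Simplified patterns (assumptions for this sketch, not full validators).
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
MOBILE_PATTERN = re.compile(r"\b[0-9]{10}\b")  # any run of exactly ten digits

def extract_contacts(html):
    """Return (emails, mobiles) found in the given HTML text."""
    emails = EMAIL_PATTERN.findall(html)
    mobiles = MOBILE_PATTERN.findall(html)
    return emails, mobiles

emails, mobiles = extract_contacts(SAMPLE_HTML)
print(emails)   # ['support@example.com', 'sales@example.org']
print(mobiles)  # ['9876543210', '9123456789']
```

In a real scraper the same function would receive the decoded text returned by `urllib.request.urlopen(...).read()`.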

Example

Python
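import re
import urllib.request
sites="google rediff".split()
print(sites)
for s in sites:
  print("Searching...",s)
  u=urllib.request.urlopen("http://"+s+".com")
  # Decode the bytes; str() on bytes would give a "b'...'" representation
  text=u.read().decode("utf-8",errors="replace")
  # Non-greedy match; re.S lets the title span more than one line
  title=re.findall("<title>.*?</title>",text,re.I|re.S)
  print(title[0] if title else "No title found")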

Output

PowerShell
['google', 'rediff']
Searching... google
<title>Google</title>
Searching... rediff
<title>Rediff.com: News | Rediffmail | Stock Quotes | Shopping</title>

Question

Write a program to get all phone numbers of redbus.in by using web scraping and regular expressions.

Python
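import re
import urllib.request
u=urllib.request.urlopen("https://www.redbus.in/info/contactus")
# Decode the downloaded bytes into text before searching
text=u.read().decode("utf-8",errors="replace")
# Runs of 8 or more digits or hyphens; this also matches dates such as 2023-11-04
numbers=re.findall("[0-9-]{7}[0-9-]+",text)
for n in numbers:
  print(n)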

Output

PowerShell
65-31582888
2023-11-04
919945600000
919945600000
55161784
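The sample output shows two side effects of such a loose pattern: a date (2023-11-04) is reported as a phone number, and one number appears twice. One possible clean-up step, sketched below, filters the matches after the search. The helper name `clean_numbers` and the date rule (four digits, two digits, two digits) are assumptions for illustration; the input list is copied from the output above.

```python
import re

# Matches copied from the sample output above.
raw_matches = ["65-31582888", "2023-11-04", "919945600000", "919945600000", "55161784"]

# Assumed rule: anything shaped like YYYY-MM-DD is a date, not a phone number.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")

def clean_numbers(matches):
    """Drop date-like strings and duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for m in matches:
        if DATE_PATTERN.fullmatch(m):
            continue  # skip dates
        if m in seen:
            continue  # skip repeats
        seen.add(m)
        result.append(m)
    return result

cleaned = clean_numbers(raw_matches)
print(cleaned)  # ['65-31582888', '919945600000', '55161784']
```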

Question

Write a Python program to check whether a given mail ID is a valid Gmail ID.

Python
import re
s=input("Enter Mail id:")
# Raw string, so that \w reaches the regex engine unchanged
m=re.fullmatch(r"\w[a-zA-Z0-9_.]*@gmail[.]com",s)
if m is not None:
  print("Valid Mail Id")
else:
  print("Invalid Mail id")

Output

PowerShell
Enter Mail id:jisir@gmail.com
Valid Mail Id
PowerShell
Enter Mail id:jkhg
Invalid Mail id
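The same check can be wrapped in a function so that several inputs are tested in one run. This is a sketch that reuses the pattern from the program above; the names `GMAIL_PATTERN` and `is_valid_gmail` and the extra test addresses are assumptions, and the pattern is this section's simplified rule rather than Gmail's actual account policy.

```python
import re

# Same pattern as the program above, compiled once for reuse.
GMAIL_PATTERN = re.compile(r"\w[a-zA-Z0-9_.]*@gmail[.]com")

def is_valid_gmail(address):
    """Return True if the whole string matches the simplified Gmail pattern."""
    return GMAIL_PATTERN.fullmatch(address) is not None

for candidate in ["jisir@gmail.com", "jkhg", "user@yahoo.com", ".abc@gmail.com"]:
    print(candidate, is_valid_gmail(candidate))
```

Only the first address is accepted: the second has no domain, the third has a different domain, and the fourth starts with a dot, which `\w` does not match.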

Question

Write a Python program to check whether a given car registration number is a valid Telangana state registration number.

Python
import re
s=input("Enter Vehicle Registration Number:")
# Raw string, so that \d reaches the regex engine unchanged
m=re.fullmatch(r"TS[012][0-9][A-Z]{2}\d{4}",s)
if m is not None:
  print("Valid Vehicle Registration Number")
else:
  print("Invalid Vehicle Registration Number")

Output

PowerShell
Enter Vehicle Registration Number:TS07EA7777
Valid Vehicle Registration Number
PowerShell
Enter Vehicle Registration Number:AP07EA7898
Invalid Vehicle Registration Number
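Named groups make it easier to see which part of the pattern matches which part of the input. The sketch below splits the same pattern into four groups; the group names (`state`, `district`, `series`, `number`) and the function name `parse_registration` are hypothetical labels chosen for illustration, not official terminology.

```python
import re

# Same pattern as above, split into named groups.
# The group names are illustrative labels only.
REG_PATTERN = re.compile(
    r"(?P<state>TS)(?P<district>[012][0-9])(?P<series>[A-Z]{2})(?P<number>\d{4})"
)

def parse_registration(reg):
    """Return the matched parts as a dict, or None if the input is invalid."""
    m = REG_PATTERN.fullmatch(reg)
    return m.groupdict() if m is not None else None

print(parse_registration("TS07EA7777"))
# {'state': 'TS', 'district': '07', 'series': 'EA', 'number': '7777'}
print(parse_registration("AP07EA7898"))
# None
```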

Question

Write a Python program to check whether a given mobile number is valid.

Python
import re
s=input("Enter Mobile Number:")
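# Optional 0 or 91 prefix, then ten digits starting with 7, 8 or 9
# (widen [7-9] to [6-9] if numbers starting with 6 must also be accepted)
m=re.fullmatch("(0|91)?[7-9][0-9]{9}",s)
if m is not None: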
  print("Valid Mobile Number")
else:
  print("Invalid Mobile Number")

Output

PowerShell
Enter Mobile Number:9686958679
Valid Mobile Number
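Because the pattern allows an optional 0 or 91 prefix, it is often useful to strip that prefix after validation. The sketch below captures only the ten-digit part; the function name `normalize_mobile` and the extra test inputs are assumptions, and the leading-digit rule `[7-9]` is taken from the program above.

```python
import re

# Non-capturing optional prefix, then a capturing group for the ten digits.
MOBILE_PATTERN = re.compile(r"(?:0|91)?([7-9][0-9]{9})")

def normalize_mobile(number):
    """Return the ten-digit number without its prefix, or None if invalid."""
    m = MOBILE_PATTERN.fullmatch(number)
    return m.group(1) if m is not None else None

for candidate in ["9686958679", "919686958679", "09686958679", "12345"]:
    print(candidate, "->", normalize_mobile(candidate))
```

The first three inputs all normalize to 9686958679, while 12345 yields None.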
