Question
Question 1: Retrieve the correct course list, with both number and title (parsing)
Question 2: Ask the user to enter a word to search for, then display only the courses that contain the given word.
PLEASE do not use modules like "BeautifulSoup".
So far this is what I have, but it is incorrect:
import re
import ssl
import urllib.request

# define the function
def course():
    # skip SSL certificate verification when opening the site
    context = ssl._create_unverified_context()
    # open site
    webpage = urllib.request.urlopen('https://www.sice.indiana.edu/undergraduate/courses/index.html', context=context)
    webcontent = webpage.read().decode(errors='replace')
    webpage.close()
    # check the head and body to locate courses
    head = re.findall('(?<=<head).+?(?=</head>)', webcontent, re.DOTALL)[0]
    body = re.findall('(?<=<body).+?(?=</body>)', webcontent, re.DOTALL)[0]
    print("Parsing: https://www.sice.indiana.edu/undergraduate/courses/index.html")
    # user_input = input("Please enter a word to search for:")
    ## print('Head:', head)
    ## print('Body:', body)
    # courses = re.findall('(?<=<meta content=")courses.+?(?= "keywords"/>)',webcontent,re.DOTALL)
    # findall returns a list with one string per accordion block; keep the whole list
    courses = re.findall('(?<=<ul class="accordion no-bullet">).+?(?=</ul>)', webcontent, re.DOTALL)
    # <ul><li><a href="computer-science.html">Computer Science Courses</a></li>
    # courses = re.findall('(?<=<h4)+INFO[w.-](?=</h4>)',webcontent,re.DOTALL)
    # loop over the blocks; taking [0] first made this loop walk the string one character at a time
    for i in courses:
        print(" ", i, " ")

# call the function so the script actually runs
course()
Explanation / Answer
Hi, I have gone through the code. There were a few mistakes, which I have corrected. I also stored all the course names in a list, so you can now implement any search algorithm on top of it. Below is my code.
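The corrected code is not reproduced here, so what follows is only a minimal sketch of the approach described (collect the courses in a list, then search it), not the answerer's actual code. It uses only `re`, as the question asks. It assumes each course appears in an `<h4>` heading of the form `INFO-I 101 Title`, which the question's commented-out attempt hints at but which has not been checked against the live page. The names `COURSE_RE`, `parse_courses`, `search_courses`, and the `sample_html` markup are all hypothetical.

```python
import re

# ASSUMPTION: each course is an <h4> heading such as
#   <h4>INFO-I 101 Introduction to Informatics</h4>
# Adjust this pattern if the real page uses different markup.
COURSE_RE = re.compile(
    r'<h4[^>]*>\s*([A-Z]{3,4}-[A-Z]\s?\d{3})\s+(.*?)\s*</h4>',
    re.DOTALL,
)

def parse_courses(html):
    """Return a list of (number, title) tuples found in the HTML text."""
    courses = []
    for number, title in COURSE_RE.findall(html):
        # drop any tags nested inside the heading and collapse whitespace
        title = re.sub(r'<[^>]+>', '', title)
        title = ' '.join(title.split())
        courses.append((number, title))
    return courses

def search_courses(courses, word):
    """Return the courses whose number or title contains word (case-insensitive)."""
    word = word.lower()
    return [(n, t) for n, t in courses
            if word in n.lower() or word in t.lower()]

# Hypothetical sample markup, used here instead of the live page
sample_html = """
<ul class="accordion no-bullet">
  <li><h4>INFO-I 101 Introduction to Informatics</h4></li>
  <li><h4>CSCI-C 200 Introduction to Computers and Programming</h4></li>
  <li><h4>INFO-I 210 Information Infrastructure I</h4></li>
</ul>
"""

all_courses = parse_courses(sample_html)
for number, title in search_courses(all_courses, "introduction"):
    print(number, "-", title)
```

To use it with the real page, pass the `webcontent` string from the question's code to `parse_courses`, and pass the result of `input("Please enter a word to search for: ")` to `search_courses`.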