Write a program that gets the root document (\"/\") from www.champlain.edu (Link
ID: 3850762 • Letter: W
Question
Write a program that gets the root document ("/") from www.champlain.edu (Links to an external site.)Links to an external site. and extracts every line that has an anchor tag, indicating that it's a hyperlink to another page. The anchor tag will start with <a so you can search for that. Ideally, include a space after the a so you only get the anchor tags.
Create a Sqlite database sending in the following query at the beginning of your program if the database doesn't exist (use the os module to check to see if a file exists).
CREATE TABLE storage(curtime TEXT, line TEXT)
You can use the following query to insert values into the database:
INSERT INTO storage (curtime, line) values ('', line);
With appropriate values for your values. You will need to create a string by concatenating your values before you pass the INSERT statement into the execute method. There are a number of ways to do this, so be sure to look up examples online.. You should be getting the current time each time you enter a value into the database and pass that in for the value of curtime. The following will get you the current time with high precision.
import datetime
current_time = datetime.datetime.now().time()
Use all the same documentation requirements you have been using and submit your program here.
Currently using Python 3.5.3. having trouble with the part below:
CREATE TABLE storage(curtime TEXT, line TEXT)
You can use the following query to insert values into the database:
INSERT INTO storage (curtime, line) values ('', line);
With appropriate values for your values. You will need to create a string by concatenating your values before you pass the INSERT statement into the execute method. There are a number of ways to do this, so be sure to look up examples online.. You should be getting the current time each time you enter a value into the database and pass that in for the value of curtime. The following will get you the current time with high precision.
Explanation / Answer
Solution:
# import required header files
from BeautifulSoup import BeautifulSoup, SoupStrainer
import httplib2
import sqlite3
import datetime
import os.path
# OS module to check DB existence
if not os.path.isfile('champlain.db'):
cntn=sqlite3.connect('champlain.db')
cur=cntn.cursor()
# Query to create a table
cur.execute('CREATE TABLE storage(curtime TEXT, line TEXT)')
# get the content from website
httpurl = httplib2.Http()
urlstatus, urlresponse = httpurl.request('http://www.champlain.edu')
for linkurl in BeautifulSoup(urlresponse, parseOnlyThese=SoupStrainer('a')):
if linkurl.has_attr('href'):
lnk=[]
# to get current time
current_time = datetime.datetime.now().time()
lnk.append(str(current_time))
lnk.append(linkurl['href'])
# code to update the value into table
cur.execute('Insert or ignore into storage values(?,?)',lnk)
# update the changes
cntn.commit()
cntn.close()
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.