Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Using Python 2.7 Write a program that gets the root document (\"/\") from www.ch

ID: 3850400 • Letter: U

Question

Using Python 2.7 Write a program that gets the root document ("/") from www.champlain.edu (Links to an external site.)Links to an external site. and extracts every line that has an anchor tag, indicating that it's a hyperlink to another page. The anchor tag will start with <a so you can search for that. Ideally, include a space after the a so you only get the anchor tags.

Create a Sqlite database sending in the following query at the beginning of your program if the database doesn't exist (use the os module to check to see if a file exists).

CREATE TABLE storage(curtime TEXT, line TEXT)

You can use the following query to insert values into the database:

INSERT INTO storage (curtime, line) values ('', line);

With appropriate values for your values. You will need to create a string by concatenating your values before you pass the INSERT statement into the execute method. There are a number of ways to do this, so be sure to look up examples online.. You should be getting the current time each time you enter a value into the database and pass that in for the value of curtime. The following will get you the current time with high precision.

import datetime

current_time = datetime.datetime.now().time()

Explanation / Answer

import urllib, re, datetime, sqlite3

def connect_sqlite(dbname, tablename):
    # SQLITE3 settings
    conn = sqlite3.connect(dbname)
    conn.text_factory = str
    # Get cursor
    c = conn.cursor()
    # Cretae table if not present
    try:
        c.execute("CREATE TABLE " + tablename + "(curtime TEXT, line TEXT)")
    except sqlite3.OperationalError:
        pass
    return conn, c

def close_sqlite(connection):
    connection.close()

def read_hrefs():
    url = "http://www.champlain.edu"
    link = ""
    hrefs_arr = []

    # Open url.
    url_data = urllib.urlopen(url)

    for line in url_data.readlines():
        # Search for anchor tags.
        m = re.search(r'.*?(<a .*?>.*?</a>).*', line)
        try:
            link = m.group(1)
            current_time = datetime.datetime.now().time()
            hrefs_arr.append({ "link" : link, "time" : current_time })
        except AttributeError:
            # If no match found.
            pass

    url_data.close()
    return hrefs_arr

def insert_into_db(cursor, hrefs_arr, tablename):
    for href in hrefs_arr:
        cursor.execute("""INSERT INTO %s (curtime, line) values (?, ?)"""%tablename, (href["link"], str(href["time"])))

# ==== MAIN ====
DB = 'example.db'
TABLE = 'storage'

all_anchors = read_hrefs()
connection, cur = connect_sqlite(DB, TABLE)
insert_into_db(cur, all_anchors, TABLE)
close_sqlite(connection)

Hire Me For All Your Tutoring Needs
Integrity-first tutoring: clear explanations, guidance, and feedback.
Drop an Email at
drjack9650@gmail.com
Chat Now And Get Quote