Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.


ID: 3834137 • Letter: 3

Question

3. Implement a web crawler that uses breadth-first traversal rather than depth-first traversal. Unlike depth-first traversal, breadth-first traversal is not naturally implemented using recursion. Instead, iteration and a queue are used. The purpose of the queue is to store URLs that have been discovered but not yet visited. Initially, the queue will contain only the starting web page, which is the only URL discovered at that point. In every iteration of a while loop, the queue is dequeued to obtain a URL, and then the associated web page is visited. Any link in the visited page with a URL that has not been visited or discovered is then added to the queue. The while loop should iterate as long as there are discovered but unvisited URLs, that is, as long as the queue is not empty.

Complete the implementation of the breadth-first web crawler by writing the crawl method in the Crawler2 class found in the zip file. The following shows what the implemented Crawler2 class would do on several sample URLs. Please note that your function must work on all possible examples, not just the ones given below. Note that no part of this question will involve recursion.

[Screenshot of a Python shell session: crawl is called on sample pages under http://facweb.cdm.depaul.edu/asettle/csc242/web/ (one.html, two.html, three.html, four.html, five.html). Each run prints a "visiting <URL>" line for every page reached, in breadth-first order, with reset called between runs.]
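The loop described above can be sketched as follows. This is a minimal illustration, not the actual Crawler2 interface from the zip file: the real class fetches pages over HTTP, whereas here a plain dict mapping each URL to its list of outgoing links stands in for page fetching, and the function name and parameters are illustrative assumptions.

```python
from collections import deque

def crawl(start_url, links):
    """Breadth-first crawl sketch.

    links is a stand-in for real page fetching: a dict mapping each
    URL to the list of URLs it links to (an assumption for this demo).
    """
    discovered = {start_url}      # every URL discovered so far
    queue = deque([start_url])    # discovered but not yet visited
    order = []
    while queue:                  # iterate while unvisited URLs remain
        url = queue.popleft()     # dequeue the next URL to visit
        print("visiting", url)
        order.append(url)
        # enqueue every link that has not been visited or discovered
        for neighbor in links.get(url, []):
            if neighbor not in discovered:
                discovered.add(neighbor)
                queue.append(neighbor)
    return order
```

Because a queue is first-in first-out, pages are visited in the order they were discovered: all links on the starting page before any links found on those pages, which is exactly the breadth-first order shown in the sample runs.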

Explanation / Answer

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class DB {
    public Connection conn = null;

    public DB() {
        try {
            // Driver class and connection details are assumptions; adjust them for your MySQL setup.
            Class.forName("com.mysql.jdbc.Driver");
            conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/Crawler", "root", "");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }

    public ResultSet runSql(String sql) throws SQLException {
        Statement sta = conn.createStatement();
        return sta.executeQuery(sql);
    }

    public boolean runSql2(String sql) throws SQLException {
        Statement sta = conn.createStatement();
        return sta.execute(sql);
    }

    @Override
    protected void finalize() throws Throwable {
        // Close the connection when the object is garbage-collected.
        if (!conn.isClosed()) {
            conn.close();
        }
    }
}
4) Create a class named "Main", which will be our crawler.

import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;


public class Main {
    public static DB db = new DB();

    public static void main(String[] args) throws SQLException, IOException {
        // Clear the Record table and start from the seed page
        // (reconstructed; the seed URL matches the one used below).
        db.runSql2("TRUNCATE Record;");
        processPage("http://www.mit.edu/");
    }

    public static void processPage(String URL) throws SQLException, IOException {
        // check whether the given URL is already in the database
        String sql = "select * from Record where URL = '" + URL + "'";
        ResultSet rs = db.runSql(sql);
        if (rs.next()) {
            // already processed; nothing to do
        } else {
            // store the URL in the database to avoid parsing it again
            sql = "INSERT INTO `Crawler`.`Record` " + "(`URL`) VALUES " + "(?);";
            PreparedStatement stmt = db.conn.prepareStatement(sql, Statement.RETURN_GENERATED_KEYS);
            stmt.setString(1, URL);
            stmt.execute();

            // get useful information from the page
            Document doc = Jsoup.connect(URL).get();

            if (doc.text().contains("research")) {
                System.out.println(URL);
            }

            // get all links and recursively call the processPage method
            Elements links = doc.select("a[href]");
            for (Element link : links) {
                processPage(link.attr("abs:href"));
            }
        }
    }
}
