Question

PLEASE WRITE CODE IN JAVA.

PLEASE INCLUDE WHICH WIKI PAGE USED

We are going to play with analyzing the contents of the Wikipedia pages. The Wikipedia pages are in XML format and were downloaded from: In the above directory are articles from Wikipedia. In class, we learned about the idea of stopwords. You can find a list of stopwords at http://www.ranks.nl/stopwords

Go to the Wikipedia dumps page: https://dumps.wikimedia.org/backup-index.html. Pick any of the wikis to download.

A3: Create a program that records the number of occurrences of these stop words. Output each stop word and its occurrence count to a file.

A4: Create a program that records all words that are not stop words and the number of occurrences of these words. Be careful if you are using arrays: you might run out of space.

Explanation / Answer

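Hi, I have written programs for both parts above (A3 and A4). I used the dewiki (German Wikipedia) dump; the programs read its index file, dewiki-20180720-pages-articles-multistream-index.txt.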
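The programs below read the dump's index file, which lists page titles only. If the article text itself is needed, one option is to stream the XML dump instead of loading it whole. The following is only a rough sketch under assumptions: the class and method names (DumpTextReader, forEachText) are made up for illustration, and it assumes the article body sits in a <text> element, as it does in the MediaWiki export format.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams an XML dump and hands each <text> body to a callback.
final class DumpTextReader {

    static void forEachText(Reader xml, Consumer<String> handler) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        try {
            while (reader.hasNext()) {
                // Only react to the opening tag of a <text> element.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "text".equals(reader.getLocalName())) {
                    // getElementText() returns the element's character content
                    // and leaves the reader on the matching end tag.
                    handler.accept(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Tiny stand-in for a real dump file.
        String sample = "<mediawiki><page><title>A</title>"
                + "<revision><text>Hallo Welt</text></revision></page></mediawiki>";
        forEachText(new StringReader(sample), text -> System.out.println(text));
    }
}
```

For a real dump, a FileReader (or an InputStreamReader with an explicit UTF-8 charset) would replace the StringReader, and the callback would tokenize and count the words rather than print them.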

Program 1:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class Test {

    public static void main(String[] args) throws IOException {
        // Index file of the dewiki dump; each line has the form offset:pageId:title.
        File f1 = new File("C:/Users/Santhosh/Desktop/dewiki-20180720-pages-articles-multistream-index.txt");
        BufferedReader br = new BufferedReader(new FileReader(f1));
        String line = br.readLine();
        List<String> main = new ArrayList<String>();
        while (line != null) {
            // Limit the split to 3 parts so that titles containing ':' stay whole.
            String[] ab = line.split(":", 3);
            if (ab.length > 2) {
                // Break the title into lower-case words so single stop words can match.
                for (String word : ab[2].toLowerCase().split("[^\\p{L}\\p{N}]+")) {
                    if (!word.isEmpty()) {
                        main.add(word);
                    }
                }
            }
            line = br.readLine();
        }
        br.close(); // release the index file before opening the next one
        System.out.println(main.size());

        // Stop word list, one word per line.
        File f2 = new File("C:/Users/Santhosh/Desktop/stopwords.txt");
        br = new BufferedReader(new FileReader(f2));
        line = br.readLine();
        List<String> stopwords = new ArrayList<String>();
        while (line != null) {
            stopwords.add(line.trim().toLowerCase());
            line = br.readLine();
        }
        br.close();

        // Write one "word: count" pair per line.
        BufferedWriter bw = new BufferedWriter(new FileWriter(new File("C:/Users/Santhosh/Desktop/output.txt")));
        for (String s1 : stopwords) {
            bw.write(s1 + ": " + Collections.frequency(main, s1));
            bw.newLine();
        }
        bw.close();
    }
}
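As a possible alternative to keeping every entry in a list and calling Collections.frequency once per stop word, the counts can be kept in a map and updated while the text is read. This is just a sketch under assumptions, not a drop-in replacement: the names StopWordCounter and count are invented here, words are taken to be runs of letters, and matching is done in lower case.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: counts a fixed set of stop words in one pass over the text.
final class StopWordCounter {

    static Map<String, Integer> count(BufferedReader text, Collection<String> stopwords) {
        // LinkedHashMap keeps the stop words in the order they were given.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String w : stopwords) {
            counts.put(w.toLowerCase(Locale.ROOT), 0);
        }
        // lines() reads lazily, so the whole text is never held in memory.
        text.lines().forEach(line -> {
            for (String token : line.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
                Integer n = counts.get(token);
                if (n != null) {          // only stop words are keys in the map
                    counts.put(token, n + 1);
                }
            }
        });
        return counts;
    }

    public static void main(String[] args) {
        BufferedReader text = new BufferedReader(
                new StringReader("Der Hund und die Katze\nund der Vogel"));
        Map<String, Integer> counts = count(text, Arrays.asList("der", "die", "und", "das"));
        counts.forEach((word, n) -> System.out.println(word + ": " + n));
        // prints der: 2, die: 1, und: 2, das: 0 (one pair per line)
    }
}
```

Memory use here depends on the number of stop words rather than on the size of the dump, which is the concern the question raises about running out of space.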

Program 2:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Test {

    public static void main(String[] args) throws IOException {
        // Index file of the dewiki dump; each line has the form offset:pageId:title.
        File f1 = new File("C:/Users/Santhosh/Desktop/dewiki-20180720-pages-articles-multistream-index.txt");
        BufferedReader br = new BufferedReader(new FileReader(f1));
        String line = br.readLine();
        List<String> main = new ArrayList<String>();
        while (line != null) {
            // Limit the split to 3 parts so that titles containing ':' stay whole.
            String[] ab = line.split(":", 3);
            if (ab.length > 2) {
                // Break the title into lower-case words.
                for (String word : ab[2].toLowerCase().split("[^\\p{L}\\p{N}]+")) {
                    if (!word.isEmpty()) {
                        main.add(word);
                    }
                }
            }
            line = br.readLine();
        }
        br.close(); // release the index file before opening the next one
        System.out.println(main.size());

        // Stop word list, one word per line.
        File f2 = new File("C:/Users/Santhosh/Desktop/stopwords.txt");
        br = new BufferedReader(new FileReader(f2));
        line = br.readLine();
        List<String> stopwords = new ArrayList<String>();
        while (line != null) {
            stopwords.add(line.trim().toLowerCase());
            line = br.readLine();
        }
        br.close();

        // Keep only the words that are not stop words.
        List<String> collList = new ArrayList<String>();
        for (String s1 : main) {
            if (!stopwords.contains(s1)) {
                collList.add(s1);
            }
        }

        // Distinct non-stop words. The set has to be filled from collList,
        // otherwise the loop below has nothing to write.
        Set<String> set1 = new HashSet<String>(collList);

        // Write one "word: count" pair per line.
        // Collections.frequency rescans collList for every distinct word,
        // so this step is slow on a large dump.
        BufferedWriter bw = new BufferedWriter(new FileWriter(new File("C:/Users/Santhosh/Desktop/output.txt")));
        for (String s1 : set1) {
            bw.write(s1 + ": " + Collections.frequency(collList, s1));
            bw.newLine();
        }
        bw.close();
    }
}
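Because Collections.frequency walks the whole list again for every word it is asked about, counting many distinct words that way gets slow on a full dump. One common alternative is a single pass that updates a map as it goes. The sketch below is an illustration under assumptions: WordCounter and countNonStopWords are hypothetical names, the stop words are expected in lower case, and words are taken to be runs of letters.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper: counts every word that is not a stop word in one pass.
final class WordCounter {

    static Map<String, Integer> countNonStopWords(BufferedReader text, Set<String> stopwords) {
        // TreeMap keeps the words sorted, which makes the output file easy to scan.
        Map<String, Integer> counts = new TreeMap<>();
        text.lines().forEach(line -> {
            for (String token : line.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
                if (!token.isEmpty() && !stopwords.contains(token)) {
                    // merge() inserts 1 for a new word, otherwise adds 1 to the old count.
                    counts.merge(token, 1, Integer::sum);
                }
            }
        });
        return counts;
    }

    public static void main(String[] args) {
        Set<String> stopwords = new HashSet<>(Arrays.asList("der", "und"));
        BufferedReader text = new BufferedReader(
                new StringReader("Der Hund und der Vogel\nund der Hund"));
        System.out.println(countNonStopWords(text, stopwords)); // prints {hund=2, vogel=1}
    }
}
```

The map grows with the number of distinct words rather than with the total number of words read, and a HashSet lookup for the stop words avoids scanning a list for every token.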

Please test it and let me know if you run into any issues. Thanks, and all the best.
