Question
Design, write, and test a Java program, TextProcessing.java, that performs the following tasks.
a. Store all input file words (including duplicated words) longer than 3 characters in an appropriate JCF data structure ds1. Justify your choice. Display ds1 content.
b. Remove all duplicated words from ds1 and store the remaining non-duplicated words in an appropriate JCF data structure ds2. Justify your choice. Display ds2 content.
Note. For example, if ds1 contains the words (gamma alpha beta gamma beta), then ds2 should contain (in any order) the words (beta gamma alpha).
c. Sort ds2 words in lexicographical order and store the sorted words in an appropriate JCF data structure ds3. Justify your choice. Display ds3 content.
Explanation / Answer
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Scanner;
public class TextProcessing {
public static void main(String[] args) throws FileNotFoundException {
Scanner sc = new Scanner(System.in);
System.out.println("Please enter the file name : ");
String fileName = sc.next();
File inputFile = new File(fileName);
Scanner fileInput = new Scanner(inputFile);
/** PART 1 STARTS HERE **/
/** Store all the words in an ArrayList (ds1).
* An ArrayList grows as needed, so it suits a file whose word count
* is not known beforehand, and it keeps duplicated words.
* It is also easy to traverse, and the following parts
* need to traverse it. **/
ArrayList<String> words = new ArrayList<String>();
//Variable to read words from the file
String word;
while(fileInput.hasNext()){
//Extract a word from the file
word = fileInput.next();
//Keep the word only if it is longer than 3 characters; otherwise ignore it
if(word.length() > 3) words.add(word);
}
//Display the data structure ds1
System.out.println(words.toString());
/** PART 1 ENDS HERE **/
/** PART 2 STARTS HERE **/
/** We still don't know how many words will remain after removing duplicates.
* Hence, we will continue to use ArrayList. **/
ArrayList<String> duplicatesRemoved = new ArrayList<String>();
//HashMap object to record which words have already been seen (each seen word maps to 1)
HashMap<String,Integer> map = new HashMap<String,Integer>();
//Traverse the ArrayList and copy the first occurrence of each word into duplicatesRemoved
for(String str: words){
//If the map has no value mapped to this word
if(map.get(str) == null){
//Then it means this word has appeared for the first time
//So, map it to 1 for future reference
map.put(str, 1);
//and add this word to duplicatesRemoved
duplicatesRemoved.add(str);
}
//However, if the word is already mapped to 1, then it means the word has already occurred
//So, ignore this word and move ahead. We don't need to do anything here
}
//Display the data structure ds2
System.out.println(duplicatesRemoved.toString());
/** PART 2 ENDS HERE**/
/** PART 3 STARTS HERE **/
/** Now we know the exact number of words. Hence we can use an array this time.
* Another motivation for using an array is that it can be easily sorted
* using Arrays.sort() as done below. **/
String[] sortedWords = new String[duplicatesRemoved.size()];
//Copy duplicatesRemoved into the array
for(int i=0;i<sortedWords.length;i++){
sortedWords[i]=duplicatesRemoved.get(i);
}
//Sort the array. For Strings, Arrays.sort() uses the natural (case-sensitive lexicographic) order
Arrays.sort(sortedWords);
//Display the data structure ds3
for(int i=0;i<sortedWords.length;i++){
System.out.print(sortedWords[i] + " ");
}
/** PART 3 ENDS HERE **/
//Close the scanner objects
fileInput.close();
sc.close();
}
}
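The answer above removes duplicates with a HashMap and sorts a plain array. As a hedged alternative for parts b and c, the sketch below uses two set classes from java.util: a LinkedHashSet for ds2, which drops duplicates while keeping first-seen order, and a TreeSet for ds3, which keeps its elements sorted by String's natural ordering. The class name SetBasedSketch and the helper names removeDuplicates and sortWords are made up for illustration and are not part of the original answer. The input is the example word list from the question rather than a file.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class, not part of the original answer.
class SetBasedSketch {

    // ds2: a LinkedHashSet discards repeated words and keeps first-seen order.
    static Set<String> removeDuplicates(List<String> ds1) {
        return new LinkedHashSet<>(ds1);
    }

    // ds3: a TreeSet stores its elements in String's natural
    // (case-sensitive lexicographic) order.
    static Set<String> sortWords(Set<String> ds2) {
        return new TreeSet<>(ds2);
    }

    public static void main(String[] args) {
        // Example word list taken from the note in the question.
        List<String> ds1 = Arrays.asList("gamma", "alpha", "beta", "gamma", "beta");
        Set<String> ds2 = removeDuplicates(ds1);
        Set<String> ds3 = sortWords(ds2);
        System.out.println(ds2); // prints [gamma, alpha, beta]
        System.out.println(ds3); // prints [alpha, beta, gamma]
    }
}
```

Either approach satisfies the task. The set-based version is shorter and both structures belong to the JCF, while the array in the original answer is a language feature rather than a JCF class.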