Question
The objective of this assignment is to implement a spell checking program using hash tables and separate chaining in Java.
Your Program General Overview: Most spell checkers use hashing to determine if a given word in a document is correct (that’s why they can check your spelling as you type).
The Dictionary: First, you must set up a dictionary in which to look up words. When your program starts, use either JFileChooser or JOptionPane.showInputDialog or a Scanner object to allow the user to specify the dictionary file to load. Read in the dictionary file. There is one word per line. Insert each word in the dictionary into a Table ADT.
Implement the Table ADT as follows:
• Use a hash table of size 94321 (a prime number that gives a load factor of about 0.5).
• Use separate chaining (in which each table slot is a (singly) linked list of nodes) to resolve collisions. Node words may be inserted at the front of the appropriate linked list.
• Use the polynomial hashing algorithm discussed in class to hash each word — use Horner’s method to compute the polynomial hash code and take the result of each operation mod the table size to avoid integer overflow. In Horner’s method, use a = 13 when evaluating your polynomial p(a).
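The three requirements above can be sketched as follows. This is a minimal illustration rather than the version discussed in class: the class name ChainedTable and its methods are hypothetical, and it assumes the first character of the word is the highest-order coefficient of the polynomial.

```java
// Sketch only: ChainedTable, insert and contains are hypothetical names.
final class ChainedTable {
    // One node of a singly linked chain.
    private static final class Node {
        final String word;
        final Node next;
        Node(String word, Node next) { this.word = word; this.next = next; }
    }

    private static final int A = 13; // the value of a given in the assignment
    private final Node[] slots;

    ChainedTable(int size) { slots = new Node[size]; }

    // Horner's method: h = (h * a + c) mod tableSize, applied once per character.
    // Assumption: the first character is the highest-order coefficient.
    static int hash(String word, int tableSize) {
        long h = 0; // long keeps h * A + c from overflowing for any int table size
        for (int i = 0; i < word.length(); i++) {
            h = (h * A + word.charAt(i)) % tableSize;
        }
        return (int) h;
    }

    // Insert at the front of the chain for the word's slot.
    void insert(String word) {
        int i = hash(word, slots.length);
        slots[i] = new Node(word, slots[i]);
    }

    // Walk the chain for the word's slot and compare each node.
    boolean contains(String word) {
        for (Node n = slots[hash(word, slots.length)]; n != null; n = n.next) {
            if (n.word.equals(word)) return true;
        }
        return false;
    }
}
```

For the dictionary, such a table would be created with new ChainedTable(94321). A tiny size such as 5 forces collisions and exercises the chains during testing.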
Checking the spelling in a file: After you have read in the dictionary file and stored its words in a table, you must check the spelling of all words in another file, a document file.
Again, use either JFileChooser or JOptionPane.showInputDialog or a Scanner object to allow the user to specify the document file to load. The file book.txt is provided for testing. This file contains multiple words per line. You are encouraged to test against other document files.
For each word in the document file, perform a Table search to see if the word is in the dictionary. If a word isn’t present in the Table, then it is a spelling error (that is, a mis-spelled word). It should be pointed out that words.txt is not a very complete dictionary and so some words that are spelled correctly will be flagged as erroneous.
Once your program has processed the document file, indicate the spelling errors by printing each mis-spelled word once. With each word, include the line numbers on which the word appears (in the order in which you found them). Note that a word can be mis-spelled multiple times on a single line. So a mis-spelled word might generate the following line in the output:
Invalid word "Michiko" found on lines 1 59 21 21
To keep track of mis-spelled words and on which lines they occur, use another hash table, known as the mis-spelled word table, to keep track of the mis-spelled words (and where they were found in the document):
• This hash table should have size 2797.
• A mis-spelled word will only be added to the mis-spelled word table if the word is not already in the mis-spelled word table. Thus, you must do a search for the word, and only add the word to the mis-spelled word table if it is not found.
• Each element of the mis-spelled word table will contain two things: a word and a queue of the line numbers on which this mis-spelled word was found in the document file. The queue of line numbers is an ordinary queue: line numbers are enqueued at the end of the queue. You may implement this as a linked list (of your choice).
• If a mis-spelled word is already in the table, simply add the new line number to the end of the queue of line numbers for that word.
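One possible shape for an element of the mis-spelled word table is sketched below. The names LineQueue, MisspelledEntry and report are hypothetical, and the queue is a plain singly linked list. Treat it as an illustration of the word-plus-queue idea, not as the required design.

```java
import java.util.NoSuchElementException;

// Sketch only: all class and method names here are hypothetical.
final class LineQueue {
    private static final class Node {
        final int line;
        Node next;
        Node(int line) { this.line = line; }
    }

    private Node head; // next line number to dequeue
    private Node tail; // most recently enqueued line number

    // Add a line number at the end of the queue.
    void enqueue(int line) {
        Node n = new Node(line);
        if (tail == null) head = n; else tail.next = n;
        tail = n;
    }

    // Remove and return the line number at the front of the queue.
    int dequeue() {
        if (head == null) throw new NoSuchElementException("queue is empty");
        int line = head.line;
        head = head.next;
        if (head == null) tail = null;
        return line;
    }

    boolean isEmpty() { return head == null; }
}

// A mis-spelled word together with the lines on which it was found.
final class MisspelledEntry {
    final String word;
    final LineQueue lines = new LineQueue();

    MisspelledEntry(String word) { this.word = word; }

    // Builds one output line, removing the line numbers one by one.
    String report() {
        StringBuilder sb = new StringBuilder("Invalid word \"" + word + "\" found on lines");
        while (!lines.isEmpty()) {
            sb.append(' ').append(lines.dequeue());
        }
        return sb.toString();
    }
}
```

When a word fails the dictionary lookup, the program would search the mis-spelled word table, create a new entry if none exists, and then call lines.enqueue(lineNumber) on the entry in either case.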
After the document file is completely processed, traverse the mis-spelled word table. For each mis-spelled word found in the table, print out both the word and the numbers of all lines on which it appeared (removing the line numbers one-by-one from the line number queue). Note that the output will likely not be in alphabetic order, which is perfectly okay.
Suggestions
• The document file that you will read has punctuation that must be handled. You must ensure that no word is seen as an error because it had punctuation attached to it. Note that an apostrophe is a valid part of a word and should not be considered punctuation. Punctuation other than an apostrophe will split a word into two. E.g., the input abc-def becomes two words, abc and def.
Suggestion: If String inLine is an input line from the document file, then the command inLine.trim().split( "[^a-zA-Z']+" ) splits the line into tokens using one or more characters that are NOT letters or an apostrophe as the splitting pattern (“^” means “not”). Try this on some sample strings to see what it does.
• Hashing "A" and "a" produces different values, but you would not expect them to be different words in a dictionary. To manage this difference, convert all text to lower case when processing (this comment also applies to the dictionary words). This conversion simplifies things and makes the spell checker more reliable. Suggestion: The helpful String method toLowerCase() will do the job for you, and it is a good idea to call it as soon as an input is read in.
• You must organize your program into appropriate classes. You should have a class representing your hash table, a class representing the data to be stored in the hash table, a class representing a queue, a class representing the data to be stored in your queue, and a main application class. You may have other classes if you need them.
• Do not have more than one hash table class. There should be one hash table class that can be used for both the dictionary words and the mis-spelled words (of course you’ll need two hash table objects). The key is to make the class that holds your hash table data flexible enough to handle both situations.
• Your data structures (hash tables, queues, etc.) should NOT contain any program logic specific for solving this program. Instead, they just should implement the standard features as discussed in class.
• Test your program using a small dictionary and document file first. Using small files along with small hash table sizes (say size=5) will allow you to quickly check if your code is working or not. A small dictionary file test-dict.txt is provided, along with a small document file test-file.txt.
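The tokenizing and lower-casing suggestions above can be tried out with a small helper like the one below. The class name Tokenizer and the method tokens are hypothetical. The helper also drops the empty token that split produces when a line starts with punctuation, a detail the suggestion does not mention.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: Tokenizer and tokens are hypothetical names.
final class Tokenizer {
    // Lower-cases the line, then splits on runs of characters that are
    // neither letters nor apostrophes, as suggested in the assignment.
    static List<String> tokens(String inLine) {
        List<String> out = new ArrayList<>();
        for (String t : inLine.trim().toLowerCase().split("[^a-zA-Z']+")) {
            if (!t.isEmpty()) { // a leading delimiter yields one empty token
                out.add(t);
            }
        }
        return out;
    }
}
```

For example, Tokenizer.tokens("abc-def can't!") yields the three tokens abc, def and can't.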
Sample output:
Your output should take the following form (note that this is not the output for the given data files):
There are a total of 10 invalid words:
Invalid word "cant" found on lines 1
Invalid word "pewter" found on lines 1
Invalid word "turbine" found on lines 1
Invalid word "Michiko" found on lines 1 59 21 21
Invalid word "fortune" found on lines 1
Invalid word "fashionable" found on lines 3 5 9 11 11 13
Invalid word "Popcorn" found on lines 5 90 104
Invalid word "10" found on lines 7 139
Invalid word "sensei" found on lines 7 139
Invalid word "haberdasher" found on lines 9 13 13
Explanation / Answer
spell checker program
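import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Scanner;
public class SpellChecker {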
private static String stringInput; // input to check;
private static String[] checkThis; // the stringInput turned array of words to check.
public static HashSet<String> dictionary; // the dictionary used
/**
* Main method.
* @param args Argh!
*/
public static void main(String[] args) {
setup();
}//end of main
/**
* This method loads the dictionary and initiates the checks for errors in a scanned input.
*/
public static void setup(){
int tableSIZE=59000;
dictionary = new HashSet<>(tableSIZE); // tableSIZE is only the initial capacity
// try-with-resources closes the file even if reading fails
// the file is located in edu.frostburg.cosc310
try (BufferedReader bufferedReader = new BufferedReader(new FileReader("./dictionary.txt"))) {
String line; // holds one line at a time
while((line = bufferedReader.readLine()) != null) {
dictionary.add(line.trim().toLowerCase());//store dictionary words in lower case, matching the lower-cased input
}
}
catch(FileNotFoundException ex) {
ex.printStackTrace();//print error
return; // no dictionary, nothing to check against
}
catch(IOException ex) {
ex.printStackTrace();//print error
return;
}
prompt(); // the dictionary is loaded and the file is closed
}//end of setUp
/**
* Just a prompt for auto generated tests or manual input test.
*/
public static void prompt(){
System.out.println("Type a number from below: ");
System.out.println("1. Auto Generate Test 2.Manual Input 3.Exit");
Scanner theLine = new Scanner(System.in);
int choice = theLine.nextInt(); // for manual input
if(choice==1) autoTest();
else if(choice==2) startwInput();
else if (choice==3) System.exit(0);
else System.out.println("Invalid Input. Exiting.");
}
/**
* Manual input of sentence or words.
*/
public static void startwInput(){
//printDictionary(bufferedReader); // print dictionary
System.out.println("Spell Checker by C. Austria Please enter text to check: ");
Scanner theLine = new Scanner(System.in);
stringInput = theLine.nextLine(); // for manual input
System.out.print(" You have entered this text: "+stringInput+" Initiating Check...");
/*------------------------------------------------------------------------------------------------------------*/
//final long startTime = System.currentTimeMillis(); //speed test
WordFinder checker = new WordFinder(); //instance of WordFinder
splitString(removePunctuation(stringInput));//turn String line to String[]
checker.initialCheck(checkThis);
//final long endTime = System.currentTimeMillis();
//System.out.println("Total execution time: " + (endTime - startTime) );
}//end of startwInput
/**
* Generates a testing case.
*/
public static void autoTest(){
System.out.println("Spell Checker by C. Austria This sentence is being tested: The dog foud my hom. And m ct hisse xdgfchv!@# ");
WordFinder checker = new WordFinder(); //instance of WordFinder
splitString(removePunctuation("The dog foud my hom. And m ct hisse xdgfchv!@# "));//turn String line to String[]
checker.initialCheck(checkThis);
}//end of autoTest
/**
* This method prints the entire dictionary.
* Was used in testing.
* @param bufferedReader the dictionary file
*/
public static void printDictionary(BufferedReader bufferedReader){
String line = null; // notes one line at a time
try{
while((line = bufferedReader.readLine()) != null) {
System.out.println(line);
}
}catch(FileNotFoundException ex) {
ex.printStackTrace();//print error
}
catch(IOException ex) {
ex.printStackTrace();//print error
}
}//end of printDictionary
/**
* This method splits the passed String into words and stores them in a String[]
* @param sentence The sentence that needs editing.
*/
public static void splitString(String sentence){
// trim first so a leading space does not produce an empty first token, then split on whitespace
checkThis = sentence.trim().split("\\s+");
}//end of splitString
/**
* This method removes the punctuation and capitalization from a string.
* @param sentence The sentence that needs editing.
* @return the edited sentence.
*/
public static String removePunctuation(String sentence){
String newSentence; // the new sentence
//remove evil punctuation and convert the whole line to lowercase
newSentence = sentence.toLowerCase().replaceAll("[^a-zA-Z\\s]", "").replaceAll("\\s+", " ");
return newSentence;
}//end of removePunctuation
}
This class checks for misspellings
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
public class WordFinder extends SpellChecker{
private int wordsLength;//length of String[] to check
private List<String> wrongWords = new ArrayList<String>();//stores incorrect words
/**
* This method checks the String[] for spelling errors.
* Hashes each index in the String[] to see if it is in the dictionary HashSet
* @param words String list of misspelled words to check
*/
public void initialCheck(String[] words){
wordsLength=words.length;
System.out.println();
for(int i=0;i<wordsLength;i++){
//System.out.println("What I'm checking: "+words[i]); //test only
if(!dictionary.contains(words[i])) wrongWords.add(words[i]);
} //end for
//manualWordLookup(); //for testing dictionary only
if (!wrongWords.isEmpty()) {
System.out.println("Mistakes have been made!");
printIncorrect();
} //end if
if (wrongWords.isEmpty()) {
System.out.println(" Move along. End of Program.");
} //end if
}//end of initialCheck
/**
* This method prints the incorrect words in the String[] being checked and generates suggestions.
*/
public void printIncorrect(){//delete this guy
System.out.print("These words [ ");
for (String wrongWord : wrongWords) {
System.out.print(wrongWord + " ");
}//end of for
System.out.println("] seem incorrect. ");
suggest();
}//end of printIncorrect
/**
* This method gives suggestions to the user based on the wrong words she/he misspelled.
*/
public void suggest(){
MisSpell test = new MisSpell();
while(!wrongWords.isEmpty()&&test.possibilities.size()<=5){
String wordCheck=wrongWords.remove(0);
test.generateMispellings(wordCheck);
//print() reports the empty case itself, so call it unconditionally (the old size()>=0 test was always true)
test.print(test.possibilities);
}//end of while
}//end of suggest
/*ENTERING TEST ZONE*/
/**
* This allows a tester to look through the dictionary to see whether words are valid; for testing only.
*/
public void manualWordLookup(){
System.out.print("Enter 'ext' to exit. ");
Scanner line = new Scanner(System.in);
String look=line.nextLine();
//loop until the sentinel is typed, so "ext" itself is never looked up
while (!look.equals("ext")){
if(dictionary.contains(look)) System.out.print(look+" is valid ");
else System.out.print(look+" is invalid ");
look=line.nextLine();
}
}//end of manualWordLookup
}
import java.util.ArrayList;
import java.util.List;
/**
* This is the main class responsible for generating misspellings.
* @author Catherine Austria
*/
public class MisSpell extends SpellChecker{
public List<String> possibilities = new ArrayList<String>();//stores possible suggestions
private List<String> tempHolder = new ArrayList<String>(); //helps the transposition method
private int Ldistance=0; // the distance related to the two words
private String wrongWord;// the original wrong word.
/**
* Execute methods that make misspellings.
* @param wordCheck the word being checked.
*/
public void generateMispellings(String wordCheck){
wrongWord=wordCheck;
try{
concatFL(wordCheck);
concatLL(wordCheck);
replaceFL(wordCheck);
replaceLL(wordCheck);
deleteFL(wordCheck);
deleteLL(wordCheck);
pluralize(wordCheck);
transposition(wordCheck);
}catch(IndexOutOfBoundsException e){
//covers String, array and list index errors: very short words make some substring calls fail, so the remaining edits are skipped
System.out.println();
}
}
/**
* This method concats the word behind each of the alphabet letters and checks if it is in the dictionary.
* FL for first letter
* @param word the word being manipulated.
*/
public void concatFL(String word){
char cur; // current character
String tempWord=""; // stores temp made up word
for(int i=97;i<123;i++){
cur=(char)i;//assign ASCII from index i value
tempWord+=cur;
//if the word is in the dictionary then add it to the possibilities list
tempWord=tempWord.concat(word); //add passed String to end of tempWord
checkDict(tempWord); //check to see if in dictionary
tempWord="";//reset temp word to contain nothing
}//end of for
}//end of concatFL
/**
* This concatenates the alphabet letters behind each of the word and checks if it is in the dictionary. LL for last letter.
* @param word the word being manipulated.
*/
public void concatLL(String word){
char cur; // current character
String tempWord=""; // stores temp made up word
for(int i=97;i<123;i++){ //'a' (97) through 'z' (122); the old bounds skipped 'a' and tried '{' (123)
cur=(char)i;//assign ASCII from index i value
tempWord=tempWord.concat(word); //add passed String to end of tempWord
tempWord+=cur;
//if the word is in the dictionary then add it to the possibilities list
checkDict(tempWord);
tempWord="";//reset temp word to contain nothing
}//end of for
}//end of concatLL
/**
* This method replaces the first letter (FL) of a word with alphabet letters.
* @param word the word being manipulated.
*/
public void replaceFL(String word){
char cur; // current character
String tempWord=""; // stores temp made up word
for(int i=97;i<123;i++){
cur=(char)i;//assign ASCII from index i value
tempWord=cur+word.substring(1,word.length()); //the letter followed by the word minus its first character
checkDict(tempWord);
tempWord="";//reset temp word to contain nothing
}//end of for
}//end of replaceFL
/**
* This method replaces the last letter (LL) of a word with alphabet letters
* @param word the word being manipulated.
*/
public void replaceLL(String word){
char cur; // current character
String tempWord=""; // stores temp made up word
for(int i=97;i<123;i++){
cur=(char)i;//assign ASCII from index i value
tempWord=word.substring(0,word.length()-1)+cur; //the word minus its last character, followed by the letter
checkDict(tempWord);
tempWord="";//reset temp word to contain nothing
}//end of for
}//end of replaceLL
/**
* This deletes first letter and sees if it is in dictionary
* @param word the word being manipulated.
*/
public void deleteFL(String word){
String tempWord=word.substring(1); // the word without its first letter (the old call also dropped the last letter)
checkDict(tempWord);
//print(possibilities);
}//end of deleteFL
/**
* This deletes last letter and sees if it is in dictionary
* @param word the word being manipulated.
*/
public void deleteLL(String word){
String tempWord=word.substring(0,word.length()-1); // stores temp made up word
checkDict(tempWord);
//print(possibilities);
}//end of deleteLL
/**
* This method pluralizes a word input
* @param word the word being manipulated.
*/
public void pluralize(String word){
String tempWord=word+"s";
checkDict(tempWord);
}//end of pluralize
/**
* Its purpose is to check whether a word is in the dictionary.
* If it is, then add it to the possibilities list.
* @param word the word being checked.
*/
public void checkDict(String word){
if(dictionary.contains(word)){//check to see if tempWord is in dictionary
//if the tempWord IS in the dictionary, then check if it is in the possibilities list
//then if tempWord IS NOT in the list, then add tempWord to list
if(!possibilities.contains(word)) possibilities.add(word);
}
}//end of checkDict
/**
* This method transposes letters of a word into different places.
* Not the best implementation. This guy was my last minute addition.
* @param word the word being manipulated.
*/
public void transposition(String word){
wrongWord=word;
int wordLen=word.length();
String[] mixer = new String[wordLen]; //String[] length of the passed word
//make word into String[]
for(int i=0;i<wordLen;i++){
mixer [i]=word.substring(i,i+1);
}
shift(mixer);
}//end of transposition
/**
* This method takes a string[] list then shifts the value in between
* the elements in the list and checks if in dictionary, adds if so.
* I agree that this is probably the brute force implementation.
* @param mixer the String array being shifted around.
*/
public void shift(String[] mixer){
System.out.println();
String wordValue="";
for(int i=0;i<=tempHolder.size();i++){
resetHelper(tempHolder);//reset the helper
transposeHelper(mixer);//fill tempHolder
String wordFirstValue=tempHolder.remove(i);//remove value at index in tempHolder
for(int j=0;j<tempHolder.size();j++){
int inttemp=0;
String temp;
while(inttemp<j){
temp=tempHolder.remove(inttemp);
tempHolder.add(temp);
wordValue+=wordFirstValue+printWord(tempHolder);
inttemp++;
if(dictionary.contains(wordValue)) if(!possibilities.contains(wordValue)) possibilities.add(wordValue);
wordValue="";
}//end of while
}//end of for
}//end for
}//end of shift
/**
* This method fills a list tempHolder with contents from String[]
* @param wordMix the String array being shifted around.
*/
public void transposeHelper(String[] wordMix){
for(int i=0;i<wordMix.length;i++){
tempHolder.add(wordMix[i]);
}
}//end of transposeHelper
/**
* This resets a list
* @param thisList removes the content of a list
*/
public void resetHelper(List<String> thisList){
thisList.clear(); //remove every element from the list
}//end of resetHelper
/**
* This method prints out a list
* @param listPrint the list to print out.
*/
public void print(List<String> listPrint){
if (possibilities.isEmpty()) {
System.out.print("Can't seem to find any related words for "+wrongWord);
return;
}
System.out.println("Maybe you meant these for "+wrongWord+": ");
System.out.printf("%s", listPrint);
resetHelper(possibilities);
}//end of print
/**
* This returns a String word version of a list
* @param listPrint the list to make into a word.
* @return the generated word version of a list.
*/
public String printWord(List<String> listPrint){
return String.join("", listPrint); //concatenate the list elements into one word
}//end of printWord
}