Overview: The goal of this assignment is to implement a simple spell checker usi
ID: 3766523 • Letter: O
Question
Overview: The goal of this assignment is to implement a simple spell checker using a hash table. You will be given the basic guidelines for your implementation, but other than that you are free to determine and implement the exact classes and methods that you might need. Your spell-checker will be reading from two input files. The first file is a dictionary containing one word per line. The program should read the dictionary and insert the words into a hash table. After reading the dictionary, it will read a list of words from a second file. The goal of the spellchecker is to determine the misspelled words in the second file by looking each word up in the dictionary. You should output each misspelled word, the line numbers in which it occurs, and a list ofpossible corrections. Possible Corrections: If the word you are looking for is not in the dictionary, assume that it is misspelled. To suggest corrections, for each misspelled word, list any words in the dictionary that are obtainable by any of the following rules. • Change one letter: For example, if the misspelled word is “kest”, try all possibilities of changing one character at a time, and look the modified word up in the dictionary. The possibilities will be “aest”, “best”,...,”zest”, “kast”,...,”kzst”, etc. • Exchange adjacent letters: For example, if the misspelled word is “ebst”, try “best”, esbt” and “ebts”. • Remove one letter: For example, if the misspelled word is “tbird”, try all possibilities of removing one letter at a time, and look the modified word up in the dictionary, which are: “bird”, “tird”, “tbrd”, and “tbir”. Note that, you have to try all possibilities, but only the ones that are actually foundin the dictionary are possible corrections to be suggested. Details: • The Hash Function: As your hash code, you may use the hashCode() method from the String class. • Collisions: To handle collisions,useseparatechaining. • Statistics: Your program should keep track of the following: The number of collisions encountered (during insertions), Length of an average chain encountered during an insertion/lookup, length of the maximum chain encountered during an insertion/look-up, and the load factor. 2 • Hash Table Expansion: Initially, you can set an appropriate hash table size but, note that you might have to expand your hash table to keep the load factor low enough. Definitely keep the load factor below 0.9. • Assume that the words in the dictionary file are given in lower case. So, before looking up a word in the dictionary, convert the word to lower case. • Output Format: For each misspelled word, generate a single output line as follows. First print the word itself, then comma, then the line number it occurs, then colon, and finally the list of possible corrections separated by commas. For example, if the misspelled word is “kest”, the output may be: kest, 12: best, nest, test, vest The final lines of the output should print the hash table statistics in the following order. no_collisions: average_chain_length: max_chain_length: load_factor: • Input: The dictionary file has to be named as “dictionary.txt”. You can test your program with a small dictionary of your own. I will provide you with a larger dictionary. This file will consist of one word perline, and each word could be of arbitrary length. The second file which contains the words to be spell-checked has to be named as “input.txt”. Its format will be the same, that is, one word per line, and each word could be of arbitrary length. Deliverables: Your submission will consist of an encapsulation of files which you will email to me and grader. Please read the following instructions carefully, since significant deviations can result in the loss of points. The encapsulation must include the following items: 1. README: This file should include: Yourname. Results. (VERY IMPORTANT) Describe the details of your design and implementation. Describe the purpose of each class. Report yourfindings on your experimentation with the hash function and the hash table size. Were you able to achieve reasonable collision rates/chain lengths? Known Bugs and Limitations: List any known bugs or limitations with respect to the program specifications. 2. Source Files: All the source files. To submit, store everything (README and all source files) in a directory whose name is your lastname (or something easily identifiable) e.g. smith. Before submitting, zip that directory, and send the resulting bundle as an email attachment to us.
Explanation / Answer
Answer :
import java.io.*;
import java.util.*;
import java.util.regex.*;
public class spellcheckercode {
Hashtable<String,String> dictionary;
boolean suggestWord ;
public static void main(String [] args)
{
spellcheckercode checker = new spellcheckercode();
}
public spellcheckercode()
{
dictionary = new Hashtable<String,String>();
System.out.println("Welcome to the spell checker using Hashtable");
System.out.println("The spell checker checks every line from the input file and then give
further steps if needed after each line. ");
try
{
BufferedReader dictReader = new BufferedReader(new FileReader("dictionary.txt"));
while (dictReader.ready())
{
String dictInput = dictReader.readLine() ;
String [] dict = dictInput.split("\s");
for(int i = 0; i < dict.length;i++)
{
dictionary.put(dict[i], dict[i]);
}
}
dictReader.close();
String file = "input.txt";
BufferedReader inputFile = new BufferedReader(new FileReader(file));
System.out.println("Reading from "+file);
spellingchild suggest = new spellingchild("words.txt");
while ( inputFile.ready() )
{
String s = inputFile.readLine() ;
System.out.println (s);
String[] result = s.split("\s");
for (int x=0; x<result.length; x++)
{
suggestWord = true;
String outputWord = checkWord(result[x]);
if(suggestWord)
{
System.out.println("Suggestions for "+result[x]+" are: "+suggest.correct
(outputWord)+" ");
}
}
}
inputFile.close();
}
catch (IOException e)
{
System.out.println("IOException Occured! ");
e.printStackTrace();
}
}
public String checkWord(String wordToCheck)
{
String wordCheck, unpunctWord;
String word = wordToCheck.toLowerCase();
if ((wordCheck = (String)dictionary.get(word)) != null)
{
suggestWord = false;
return wordCheck;
}
int length = word.length();
if (length > 1 && word.substring(0,1).equals("""))
{
unpunctWord = word.substring(1, length);
if ((wordCheck = (String)dictionary.get(unpunctWord)) != null)
{
suggestWord = false;
return wordCheck ;
}
else
return unpunctWord;
}
if( word.substring(length - 1).equals(".") || word.substring(length - 1).equals(",") ||
word.substring(length - 1).equals("!")
|| word.substring(length - 1).equals(";") || word.substring(length - 1).equals(":"))
{
unpunctWord = word.substring(0, length-1);
if ((wordCheck = (String)dictionary.get(unpunctWord)) != null)
{
suggestWord = false;
return wordCheck ;
}
else
{
return unpunctWord;
}
}
if (length > 2 && word.substring(length-2).equals(","") || word.substring(length-
2).equals("."")
|| word.substring(length-2).equals("?"") || word.substring(length-2).equals("!"") )
{
unpunctWord = word.substring(0, length-2);
if ((wordCheck = (String)dictionary.get(unpunctWord)) != null)
{
suggestWord = false;
return wordCheck ;
}
else
return unpunctWord;
}
return word;
}
}
class spellingchild {
private final HashMap<String, Integer> DBWords = new HashMap<String, Integer>();
public spellingsuggest(String file) throws IOException
{
try
{
BufferedReader in = new BufferedReader(new FileReader(file));
Pattern p = Pattern.compile("\w+");
for(String temp = ""; temp != null; temp = in.readLine() )
{
Matcher m = p.matcher(temp.toLowerCase());
while(m.find())
{
DBWords.put( (temp = m.group()), DBWords.containsKey
(temp) ? DBWords.get(temp) + 1 : 1 );
}
}
in.close();
}
catch(IOException e)
{
System.out.println("Uh-Oh Exception occured!");
e.printStackTrace();
}
}
private final ArrayList<String> edits(String word)
{
ArrayList<String> result = new ArrayList<String>();
for(int i=0; i < word.length(); ++i)
{
result.add(word.substring(0, i) + word.substring(i+1));
}
for(int i=0; i < word.length()-1; ++i)
{
result.add(word.substring(0, i) + word.substring(i+1, i+2) +
word.substring(i, i+1) + word.substring(i+2));
}
for(int i=0; i < word.length(); ++i)
{
for(char c='a'; c <= 'z'; ++c)
{
result.add(word.substring(0, i) + String.valueOf(c) + word.substring
(i+1));
}
}
for(int i=0; i <= word.length(); ++i)
{
for(char c='a'; c <= 'z'; ++c)
{
result.add(word.substring(0, i) + String.valueOf(c) + word.substring
(i));
}
}
return result;
}
public final String correct(String word)
{
if(DBWords.containsKey(word))
{
return word;
}
ArrayList<String> list_edits = edits(word);
HashMap<Integer, String> candidates = new HashMap<Integer, String>();
for(String s : list_edits)
{
if(DBWords.containsKey(s))
{
candidates.put(DBWords.get(s),s);
}
}
if(candidates.size() > 0)
{
return candidates.get(Collections.max(candidates.keySet()));
}
for(String s : list_edits)
{
for(String w : edits(s))
{
if(DBWords.containsKey(w))
{
candidates.put(DBWords.get(w),w);
}
}
}
return candidates.size() > 0 ? candidates.get(Collections.max
(candidates.keySet())) : "Sorry but no possible corrections found!";
}
public static void main(String [] args) throws IOException
{
if(args.length > 0)
{
System.out.println((new spellingsuggest("words.txt")).correct(args[0]));
}
}
}
Related Questions
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.