Question

hamlet.txt url: http://www.buildingjavaprograms.com/code_files/3ed/ch06/hamlet.txt

words.txt url: http://www.buildingjavaprograms.com/code_files/3ed/ch13/words.txt

Write a program to determine and report on word length frequencies for hamlet.txt and words.txt.  Word lengths are the number of characters each word takes.  For example, "cat" has length 3, "bobsled" has length 7.  By frequency I mean how many times words of a specific length appear.  How many words of length 2 appear in hamlet?  How many words of length 5 appear in words.txt?  How many different words of length 4 appear in hamlet?  Your program should determine all the word length frequencies (for each word length that exists in words, hamlet, and the list of distinct words in hamlet, determine how many times it appears in each of these).  It should also determine the average word length based on these frequencies for hamlet, the list of distinct words in hamlet, and the list of words in words.txt.  This program should be called WordLengthFrequencies.java, and it should print all its results to the monitor.

You will build the list of words in words.txt, the list of all words in hamlet.txt, and the list of distinct words in hamlet.txt, as you did in prior parts of the assignment (as ArrayLists).  You can approach the problem in several different ways; how you do it is up to you.  One way would be to make separate ArrayLists of Integer where the index in the ArrayList is the word length and the value at that index is the frequency.  Another would be to make a separate class called LengthFrequencies, or the like, that contains 2 int fields, length and frequency, and make an ArrayList of these.  Another way would be to make a compareTo method for a new class that sorts based on the length of its String field, put all the words into instances of this class, make an ArrayList of these, sort them, and use the list sorted by word length.  No matter how you do it, the output should look as follows (as an example):

Word length frequencies for words.txt:

Average word length: 2.67

Words of length 1: 1

Words of length 2: 938

Words of length 3: 2745

Words of length 4: ...

...

...

...

Word length frequencies for all words in hamlet.txt:

Average word length: 3.2

Words of length 1: 234

Words of length 2: 494378

Words of length 3: 4937

Words of length 4: ...

...

...

...

Word length frequencies for distinct words in hamlet.txt:

Average word length: 4.321

Words of length 1: 1

Words of length 2: 3445

Words of length 3: 24242

Words of length 4: ...

...

...

...

Note that the average word length for each list is derived by computing (1 * the number of words of length 1) + (2 * the number of words of length 2) + (3 * the number of words of length 3) + ... and then dividing this sum by the total number of words in the list.
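The first approach mentioned above (an ArrayList of Integer indexed by word length) and the averaging rule can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the class name LengthTableSketch, the helper names countLengths and averageLength, and the sample words are all hypothetical and do not come from the assignment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: index in the list = word length, value = frequency.
class LengthTableSketch {

    // Builds the frequency table; index 0 is unused for non-empty words.
    static ArrayList<Integer> countLengths(List<String> words) {
        ArrayList<Integer> freq = new ArrayList<Integer>();
        for (String w : words) {
            // Grow the list until index w.length() exists.
            while (freq.size() <= w.length()) {
                freq.add(0);
            }
            freq.set(w.length(), freq.get(w.length()) + 1);
        }
        return freq;
    }

    // (1 * count of length 1) + (2 * count of length 2) + ...,
    // divided by the total number of words counted.
    static double averageLength(ArrayList<Integer> freq) {
        long weightedSum = 0;
        long wordCount = 0;
        for (int len = 1; len < freq.size(); len++) {
            weightedSum += (long) len * freq.get(len);
            wordCount += freq.get(len);
        }
        return wordCount == 0 ? 0.0 : (double) weightedSum / wordCount;
    }

    public static void main(String[] args) {
        // Made-up sample words, not taken from hamlet.txt or words.txt.
        List<String> sample = Arrays.asList("cat", "dog", "bobsled", "a");
        ArrayList<Integer> freq = countLengths(sample);
        System.out.println("Average word length: " + averageLength(freq));
        for (int len = 1; len < freq.size(); len++) {
            if (freq.get(len) > 0) {
                System.out.println("Words of length " + len + ": " + freq.get(len));
            }
        }
    }
}
```

With this layout the divisor is the sum of all frequencies (the number of words), not the number of distinct lengths, which matches the rule stated in the note.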

Explanation / Answer

Complete Program:


// File: WordLengthFrequencies.java
import java.io.*;
import java.util.*;
public class WordLengthFrequencies
{
public static void main(String[] args) throws FileNotFoundException
{
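  // LengthFrequencies (a length/frequency pair) is not defined in this file; it must be
  // supplied separately and implement Comparable for the Collections.sort calls below.
  ArrayList<LengthFrequencies> list1 = new ArrayList<LengthFrequencies>(); // words.txt
  ArrayList<LengthFrequencies> list2 = new ArrayList<LengthFrequencies>(); // all words in hamlet.txt
  ArrayList<LengthFrequencies> list3 = new ArrayList<LengthFrequencies>(); // distinct words in hamlet.txt
  ArrayList<String> list4 = new ArrayList<String>(); // distinct words seen so far in hamlet.txt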
  
  Scanner infile1 = new Scanner(new File("words.txt"));
  Scanner infile2 = new Scanner(new File("hamlet.txt"));
  String word = "";
      
  while(infile1.hasNext())
  {
   word = infile1.next();   
   
   addToList(list1, word);
  }
    
  while(infile2.hasNext())
  {
   word = infile2.next();
   
   addToList(list2, word);
   
   if(!list4.contains(word))
   {
    list4.add(word);
    
    addToList(list3, word);
   }
  }
  
  Collections.sort(list1);
  Collections.sort(list2);
  Collections.sort(list3);

  System.out.println("Word length frequencies for words.txt:");
  System.out.println("Average word length: " + getAverageWordLength(list1));
  printList(list1);
  
  System.out.println("Word length frequencies for all words in hamlet.txt:");
  System.out.println("Average word length: " + getAverageWordLength(list2));
  printList(list2); // was list3, which printed the distinct-word counts here
  
  System.out.println("Word length frequencies for distinct words in hamlet.txt:");
  System.out.println("Average word length: " + getAverageWordLength(list3));
  printList(list3);  
}

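public static double getAverageWordLength(ArrayList<LengthFrequencies> list)
{
  double total = 0;   // sum of (length * frequency)
  int wordCount = 0;  // total number of words, i.e. the sum of all frequencies
  
  for(int i = 0; i < list.size(); i++)
  {
   total += (list.get(i).getLength() * list.get(i).getFrequency());
   wordCount += list.get(i).getFrequency();
  }
  
  // divide by the number of words, not by the number of distinct lengths
  if(wordCount > 0)
   return total / wordCount;
  else
   return 0;
}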

public static void printList(ArrayList<LengthFrequencies> list)
{
  for(int i = 0; i < list.size(); i++)
   System.out.println(list.get(i));
  System.out.println();  
}

public static void addToList(ArrayList<LengthFrequencies> list, String word)
{
  boolean found = false;
  int index = 0;
  
  for(int i = 0; i < list.size() && !found; i++)
  {
   if(list.get(i).getLength() == word.length())
   {
    found = true;
    index = i;
   }
  }
  
  if(found)
  {
   list.get(index).setFrequency(list.get(index).getFrequency() + 1);
  }
  else
  {
   list.add(new LengthFrequencies(word.length(), 1));
  }
}
}
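
The program above uses a LengthFrequencies class that the answer does not show. A minimal sketch of what it might look like is given below, inferred only from how the program calls it (the two-int constructor, getLength, getFrequency, setFrequency, sorting with Collections.sort, and printing with println). The field names, the ordering by length, and the toString format are assumptions chosen to match the expected output in the question.

```java
// Hypothetical helper class: the answer relies on it but does not include it.
// Pairs a word length with the number of words of that length.
class LengthFrequencies implements Comparable<LengthFrequencies> {
    private int length;
    private int frequency;

    public LengthFrequencies(int length, int frequency) {
        this.length = length;
        this.frequency = frequency;
    }

    public int getLength() {
        return length;
    }

    public int getFrequency() {
        return frequency;
    }

    public void setFrequency(int frequency) {
        this.frequency = frequency;
    }

    // Orders by word length so Collections.sort lists lengths in ascending order.
    @Override
    public int compareTo(LengthFrequencies other) {
        return Integer.compare(length, other.length);
    }

    // Assumed format, matching the "Words of length N: count" lines in the question.
    @Override
    public String toString() {
        return "Words of length " + length + ": " + frequency;
    }
}
```

Because the class is not public, it could be placed at the bottom of WordLengthFrequencies.java, or made public and saved in its own LengthFrequencies.java file.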