Urgent answer needed. When an author produces an index for his or her book, the f
Question
Urgent answer needed.
When an author produces an index for his or her book, the first step is to decide which words should go into the index; the second is to produce a list of the pages where each word occurs. Instead of trying to choose words out of our heads, we decided to let the computer produce a list of all the unique words used in the manuscript and their frequency of occurrence. We could then go over the list and choose which words to put into the index.
The main object in this problem is a "word" with an associated frequency. The tentative definition of "word" here is a string of alphanumeric characters between markers, where markers are white space and all punctuation marks; anything non-alphanumeric stops the reading. If we skip all disallowed characters before reading the string, we should have exactly what we want. Ignoring words of fewer than three letters removes from consideration words such as "a", "is", "to", "do", and "by" that do not belong in an index.
In this project, you are asked to write a program that reads any text file and then lists all the "words" in alphabetical order, together with the frequency with which each appears in the text. A "word" is defined as above and has at least three letters. Your result should be printed to an output file named YourUserID.txt.
You need to create a Binary Search Tree (BST) to store all the word objects by writing an insertion-or-increment function. Finally, a proper traversal print function of the BST should output the required results. The BST class in the text cannot be used directly to solve this problem. It is also NOT a good idea to modify the BST class to solve this problem. Instead, the following code is recommended as a starting point for your program.
// Data stored in the node type
struct WordCount {
    string word;
    int count;
};

// Node type:
struct TreeNode {
    WordCount info;
    TreeNode* left;
    TreeNode* right;
};

// Two function prototypes

// Increments the frequency count if the string is in the tree
// or inserts the string if it is not there.
void Insert(TreeNode*&, string);

// Prints the words in the tree and their frequency counts.
void PrintTree(TreeNode*, ofstream&);

// Start your main function and the definitions of the above two functions.
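The definition of a "word" above can be sketched as a small scanning routine. This is only an illustration of the strict reading of that definition, not part of the starter code; the function name ExtractWords and the minLength parameter are assumptions made for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not in the assignment's starter code):
// splits text into runs of alphanumeric characters and keeps only
// the runs that are at least minLength characters long.
std::vector<std::string> ExtractWords(const std::string& text,
                                      std::size_t minLength = 3) {
    std::vector<std::string> words;
    std::string current;
    for (char ch : text) {
        // Cast to unsigned char so isalnum is well defined for every byte.
        if (std::isalnum(static_cast<unsigned char>(ch))) {
            current += ch;  // still inside a word
        } else {
            // Any non-alphanumeric character is a marker and ends the word.
            if (current.size() >= minLength) words.push_back(current);
            current.clear();
        }
    }
    // The text may end in the middle of a word.
    if (current.size() >= minLength) words.push_back(current);
    return words;
}
```

Under this strict reading, a hyphen is a marker like any other, so "battle-field" yields the two words "battle" and "field". The sample mus11.txt below lists "battle-field" as a single entry, so the reference output apparently treats an interior hyphen as part of the word.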
Sample Run
Please type the text file name: Lincoln.txt
Please give the output text file name: mus11.txt
You are done! You can open the file "mus11.txt" to check.
Press any key to continue
Download the following files: Lincoln.txt, mus11.txt
Upload your source and output files for the assignment below (upload them, do NOT zip them): your source file proj9.cpp, with your sample run pasted as comments at the end, and your output file userID.txt.
Lincoln.txt
The Gettysburg Address
Gettysburg, Pennsylvania
November 19, 1863
Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in
Liberty, and dedicated to the proposition that all men are created equal.
Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and
so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate
a portion of that field, as a final resting place for those who here gave their lives that that nation
might live. It is altogether fitting and proper that we should do this.
But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground.
The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add
or detract. The world will little note, nor long remember what we say here, but it can never forget what
they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they
who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great
task remaining before us -- that from these honored dead we take increased devotion to that cause for
which they gave the last full measure of devotion -- that we here highly resolve that these dead shall not
have died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government
of the people, by the people, for the people, shall not perish from the earth.
mus11.txt
1863 1
Address 1
But 1
Four 1
Gettysburg 2
God 1
Liberty 1
November 1
Now 1
Pennsylvania 1
The 3
above 1
add 1
advanced 1
ago 1
all 1
altogether 1
and 6
any 1
are 3
battle-field 1
before 1
birth 1
brave 1
brought 1
but 1
can 5
cause 1
civil 1
come 1
conceived 2
consecrate 1
consecrated 1
continent 1
created 1
dead 3
dedicate 2
dedicated 4
detract 1
devotion 2
did 1
died 1
earth 1
endure 1
engaged 1
equal 1
far 2
fathers 1
field 1
final 1
fitting 1
for 5
forget 1
forth 1
fought 1
freedom 1
from 2
full 1
gave 2
government 1
great 3
ground 1
hallow 1
have 5
here 8
highly 1
honored 1
increased 1
larger 1
last 1
little 1
live 1
lives 1
living 2
long 2
measure 1
men 2
met 1
might 1
nation 5
never 1
new 2
nobly 1
nor 1
not 5
note 1
our 2
people 3
perish 1
place 1
poor 1
portion 1
power 1
proper 1
proposition 1
rather 2
remaining 1
remember 1
resolve 1
resting 1
say 1
score 1
sense 1
seven 1
shall 3
should 1
struggled 1
take 1
task 1
testing 1
that 13
the 9
their 1
these 2
they 3
this 4
those 1
thus 1
under 1
unfinished 1
vain 1
war 2
what 2
whether 1
which 2
who 3
will 1
work 1
world 1
years 1
Explanation / Answer
The code and output are given below. Please don't forget to rate the answer if it helped. Thank you very much.
BST.h
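#ifndef BST_h
#define BST_h
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

// Data stored in each node: a word and the number of times it was seen.
struct WordCount {
    string word;
    int count;
};

// Node type of the binary search tree.
struct TreeNode {
    WordCount info;
    TreeNode* left;
    TreeNode* right;
};

class BST
{
private:
    TreeNode* root;

public:
    BST()
    {
        root = nullptr;
    }

    // Returns the root by reference so that Insert can attach the first node.
    TreeNode*& getRoot()
    {
        return root;
    }

    // Increments the count if the word is already in the tree,
    // otherwise inserts it with a count of 1.
    void Insert(TreeNode*& n, string word)
    {
        if (n == nullptr)
        {
            n = new TreeNode();
            n->info.word = word;
            n->info.count = 1;
            n->left = nullptr;
            n->right = nullptr;
        }
        else if (n->info.word == word)
        {
            n->info.count++;
        }
        else if (word < n->info.word)
            Insert(n->left, word);
        else
            Insert(n->right, word);
    }

    // In-order traversal: prints the words in sorted order with their counts.
    void PrintTree(TreeNode* n, ofstream& out)
    {
        if (n == nullptr)
            return;
        PrintTree(n->left, out);
        out << n->info.word << " " << n->info.count << endl;
        PrintTree(n->right, out);
    }
};
#endif /* BST_h */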
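Two points about the tree are worth illustrating. First, the comparison word < n->info.word compares character codes, so an in-order traversal lists digits first, then uppercase words, then lowercase words, which is why "The" comes before "above" in the output. Second, the class above never releases the nodes it allocates. The sketch below shows both with free functions; CollectInOrder and DestroyTree are hypothetical helper names added for this example and are not part of the posted answer.

```cpp
#include <string>
#include <vector>

struct WordCount { std::string word; int count; };
struct TreeNode { WordCount info; TreeNode* left; TreeNode* right; };

// Same insert-or-increment logic as in BST.h, written as a free function.
void Insert(TreeNode*& n, const std::string& word) {
    if (n == nullptr) {
        n = new TreeNode{{word, 1}, nullptr, nullptr};
    } else if (word == n->info.word) {
        ++n->info.count;
    } else if (word < n->info.word) {
        Insert(n->left, word);
    } else {
        Insert(n->right, word);
    }
}

// Hypothetical helper: in-order traversal that collects the words,
// which visits them in ascending order of std::string comparison.
void CollectInOrder(const TreeNode* n, std::vector<std::string>& out) {
    if (n == nullptr) return;
    CollectInOrder(n->left, out);
    out.push_back(n->info.word);
    CollectInOrder(n->right, out);
}

// Hypothetical helper: post-order cleanup, deleting both subtrees
// before the node itself and resetting the caller's pointer.
void DestroyTree(TreeNode*& n) {
    if (n == nullptr) return;
    DestroyTree(n->left);
    DestroyTree(n->right);
    delete n;
    n = nullptr;
}
```

Inserting "the", "The", "1863", "above" and "the" again, then calling CollectInOrder, gives the four distinct words in the order 1863, The, above, the. Calling DestroyTree on the root afterwards frees every node; in the BST class this logic could live in a destructor.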
main.cpp
#include <cctype>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include "BST.h"
using namespace std;

// Removes all punctuation marks and keeps only alphanumeric characters.
string removeMarkers(const string& word)
{
    string newWord;
    for (char ch : word)
    {
        // Cast to unsigned char: isalnum is undefined for negative char values.
        if (isalnum(static_cast<unsigned char>(ch)))
            newWord += ch;
    }
    return newWord;
}

int main()
{
    string infilename, outfilename;
    cout << "Please type the text file name: ";
    cin >> infilename;
    cout << "Please give the output text file name: ";
    cin >> outfilename;

    ifstream infile(infilename);
    if (infile.fail())
    {
        cout << "Could not open input file " << infilename << endl;
        exit(1);
    }

    // Open the output file only after the input file is known to be readable.
    ofstream outfile(outfilename);
    if (outfile.fail())
    {
        cout << "Could not open output file " << outfilename << endl;
        exit(1);
    }

    string word;
    BST bst;
    // operator>> splits the input on white space; removeMarkers strips the rest.
    while (infile >> word)
    {
        word = removeMarkers(word);
        if (word.length() >= 3) // insert only words of length 3 or more
        {
            bst.Insert(bst.getRoot(), word);
        }
    }
    infile.close();

    bst.PrintTree(bst.getRoot(), outfile);
    outfile.close();
    cout << "You are done! You can open the file " << outfilename << " to check" << endl;
    return 0;
}
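Note that removeMarkers deletes punctuation inside a token as well as around it, so "battle-field" is stored as "battlefield", while the sample mus11.txt in the question lists "battle-field". One possible way to keep such an entry intact is to strip markers only from the two ends of each white-space-delimited token. The sketch below assumes that behavior is what is wanted; TrimMarkers is a hypothetical name, and this is a guess at one approach, not a statement of how the reference output was produced.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical alternative to removeMarkers: strips non-alphanumeric
// characters from both ends of a token but leaves interior ones alone.
std::string TrimMarkers(const std::string& token) {
    std::size_t begin = 0;
    std::size_t end = token.size();
    // Skip leading markers.
    while (begin < end &&
           !std::isalnum(static_cast<unsigned char>(token[begin]))) {
        ++begin;
    }
    // Skip trailing markers.
    while (end > begin &&
           !std::isalnum(static_cast<unsigned char>(token[end - 1]))) {
        --end;
    }
    return token.substr(begin, end - begin);
}
```

With this helper, "nation," becomes "nation", a standalone "--" becomes an empty string (and is then dropped by the length check), and "battle-field" is left unchanged.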
output
Please type the text file name: Lincoln.txt
Please give the output text file name: mus11.txt
You are done! You can open the file mus11.txt to check
output file mus11.txt
1863 1
Address 1
But 1
Four 1
Gettysburg 2
God 1
Liberty 1
November 1
Now 1
Pennsylvania 1
The 3
above 1
add 1
advanced 1
ago 1
all 1
altogether 1
and 6
any 1
are 3
battlefield 1
before 1
birth 1
brave 1
brought 1
but 1
can 5
cause 1
civil 1
come 1
conceived 2
consecrate 1
consecrated 1
continent 1
created 1
dead 3
dedicate 2
dedicated 4
detract 1
devotion 2
did 1
died 1
earth 1
endure 1
engaged 1
equal 1
far 2
fathers 1
field 1
final 1
fitting 1
for 5
forget 1
forth 1
fought 1
freedom 1
from 2
full 1
gave 2
government 1
great 3
ground 1
hallow 1
have 5
here 8
highly 1
honored 1
increased 1
larger 1
last 1
little 1
live 1
lives 1
living 2
long 2
measure 1
men 2
met 1
might 1
nation 5
never 1
new 2
nobly 1
nor 1
not 5
note 1
our 2
people 3
perish 1
place 1
poor 1
portion 1
power 1
proper 1
proposition 1
rather 2
remaining 1
remember 1
resolve 1
resting 1
say 1
score 1
sense 1
seven 1
shall 3
should 1
struggled 1
take 1
task 1
testing 1
that 13
the 9
their 1
these 2
they 3
this 4
those 1
thus 1
under 1
unfinished 1
vain 1
war 2
what 2
whether 1
which 2
who 3
will 1
work 1
world 1
years 1