Aim: This assignment is to familiarise you with the use of STL container classes
Question
Aim:
This assignment is to familiarise you with the use of STL container classes in your programs.
On completion you should know how to:
• Write C++ code using STL container class objects.
• Devise programs requiring data manipulation with STL containers.
• Gain experience writing and debugging complex C++ programs incrementally.
Prerequisites:
Before you undertake this assignment please review the week 8 lecture notes on the Standard
Template Library (STL). Also, download and study the week 7&8 STL example programs in the
Examples folder. The following links may also prove useful for learning STL containers:
http://www.cplusplus.com/reference/stl/
http://www.cprogramming.com/tutorial/stl/stlintro.html
http://www.sgi.com/tech/stl/stl_introduction.html
If you are unsure how to do something with STL, try looking at the examples or google “STL”
together with “set” or “map” or “multimap” etc.
Requirements:
For this assignment you are to implement a C++ program that can read a text file and display any known words
on the screen together with their frequency and positions in the file. A snippet of the expected output is
shown below:
Word Count Position(s)
a 7 22 65 87 96 130 148 165
all 1 50
and 9 19 70 125 134 138 157 177 247 261
are 1 74
as 2 169 214
automata 1 73
be 1 99
begin 1 106
between 1 132
cell 5 97 131 156 180 238
cellular 1 72
create 1 143
cursor 2 119 124
eight 1 171
every 2 155 237
first 2 20 110
A “known” word is a word in the English dictionary, which is defined by the words listed in the file named
“dictionary.txt”. A word's positions are all the places where that word occurs in the data file, counted in words.
For example, in the above screen output, the word “first” occurred twice, at word positions 20 and 110 in
the test data file.
A partially completed “WordStats” class is provided in “wordstats.h” and “wordstats.cpp”. A driver program is also
provided in “main.cpp”. The file “output.txt” shows the output your completed program should produce when run
with redirected input from “input.txt”. You should do this assignment incrementally by completing the following
steps.
Step 1 (2 marks)
Implement the ReadDictionary() and DisplayDictionary() public member functions in the
“WordStats” class in “wordstats.h” & “wordstats.cpp”. ReadDictionary() should open “dictionary.txt”
and read all the words (in lower case format) into the private “Dictionary” member:
set<string> Dictionary;
If you are unsure about using the STL set container, take a look at “08-set.cpp” in the examples
folder. DisplayDictionary() should display the first 20 words in the Dictionary on the screen. The
screenshot below shows an example output of reading and displaying the dictionary.
Step-1 Reading and displaying dictionary
25127 words read from dictionary.
Displaying the first 20 words in the dictionary...
aarhus
aaron
ababa
aback
abacus
...
abbot
abbott
abbreviate
abc
abdicate
abdomen
Step 2 (2 marks)
Implement the ReadTxtFile() public member function. This function should read the contents of
the file named “testdata.txt” into the “KnownWords” and “UnknownWords” data members of the
“WordStats” class:
WordMap KnownWords;
WordMap UnknownWords;
Please note that “WordMap” is typedefed in “wordstats.h” as:
typedef map<string, vector<int> > WordMap;
Only “known” words should be put in the “KnownWords” WordMap; every other word goes into
“UnknownWords”. A “known” word is any word that is found in the Dictionary class member. Before
attempting to find a word in the Dictionary, you should first preprocess the word by converting all
characters to lower case and removing any non-alphabetic characters except for the punctuation marks
hyphen (-) and apostrophe ('). You can add additional private member functions to class WordStats if
you wish. Your output from step-2 should look something like this:
Step-2 Reading known words from text file:
89 known words read.
49 unknown words read.
If you are unsure about working with map containers, an example is provided in “08-map.cpp” in the
examples folder.
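The preprocessing rule in Step 2 can be sketched as a small free function. This is only an illustration under stated assumptions: the name normalizeWord is hypothetical and is not part of the supplied WordStats class, and the sketch assumes plain ASCII text.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not in the supplied files): lower-cases a token and
// drops every character that is not a letter, a hyphen, or an apostrophe.
std::string normalizeWord(const std::string& raw) {
    std::string cleaned;
    for (unsigned char ch : raw) {  // unsigned char keeps the <cctype> calls well-defined
        if (std::isalpha(ch)) {
            cleaned += static_cast<char>(std::tolower(ch));
        } else if (ch == '-' || ch == '\'') {
            cleaned += static_cast<char>(ch);
        }
    }
    return cleaned;
}
```

A token made up only of digits or punctuation comes back empty, so the caller has to decide whether such a token still counts as a word position; the assignment text does not say.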
Step 3 (1 mark)
Implement the DisplayKnownWordStats() public member function. This function should iterate over
the “KnownWords” WordMap and display the word stats on the screen as shown in the sample output
under Requirements. If you are unsure how to do this with a map, look at “08-map.cpp” in the
examples folder.
Step 4 (1 mark)
Implement the “DisplayUnknownWordStats()” public member function. This function should
display the unknown words in the same format as step-3. Note: try to avoid duplicating code for
this by declaring an additional private member function that is passed a WordMap by reference
and called by both the display functions from step-3 and step-4.
Step 5 (1 mark)
Implement the “DisplayMostFreqKnownWords()” public member function. This function should
display the 10 most frequently occurring words in the “KnownWords” container. E.g.
Step-5 Displaying most frequent known words:
Word Count
the 21
live 10
of 9
and 9
or 7
in 7
a 7
to 6
is 6
cell 5
To do this, declare a local multimap<int, string> container. You will need to iterate over the
“KnownWords” container and insert the size of the vector (as the key) and the word into the local
multimap. This multimap can then be iterated, from the largest key down, to display the 10 most
frequent words on the screen.
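The multimap approach described for Step 5 might look like the sketch below. The function name topWords and its limit parameter are assumptions made for illustration; only the WordMap typedef comes from the assignment.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, std::vector<int> > WordMap;  // as typedefed in wordstats.h

// Hypothetical helper: returns up to `limit` (count, word) pairs, most frequent first.
std::vector<std::pair<int, std::string> > topWords(const WordMap& words, std::size_t limit) {
    // Key each word by how many positions it has; a multimap keeps words that tie.
    std::multimap<int, std::string> byCount;
    for (const auto& entry : words) {
        byCount.emplace(static_cast<int>(entry.second.size()), entry.first);
    }
    // Keys are sorted ascending, so walk backwards to reach the largest counts first.
    std::vector<std::pair<int, std::string> > result;
    for (auto it = byCount.rbegin(); it != byCount.rend() && result.size() < limit; ++it) {
        result.emplace_back(it->first, it->second);
    }
    return result;
}
```

Because equal keys keep their insertion order and the walk is reversed, words that tie on count come out in reverse alphabetical order, which is consistent with the sample output above (“of” before “and”, “or” before “in” before “a”).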
Step 6 (1 mark)
Implement the “DisplayMostFreqUnknownWords()” public member function to display the 10
most frequently occurring words in the “UnknownWords” container. Again, try to avoid duplicate
code by adding another private member function called by the display functions from step-5 and step-6.
Step 7 (2 marks)
Implement the “DisplayOriginalText()” public member function. To do this, declare a local
map<int, string> container and do the same as you did with Step 6, except you should also iterate
over the vectors in the KnownWords and UnknownWords containers and add a (position, word) pair
for all words and their positions. This will sort the words into their original text order based on the
word position numbers. The screen output should look something like below.
the game of life written by ian sharpe the game of life was invented by the mathematician john
conway and first reached a wide public when it was written up in scientific american in 1970 or
thereabouts in those days it was mostly played on squared paper nowadays computers take all the
hard work out of this fascinating invention to some it's nothing more than a toy to others it and
related cellular automata are subjects for serious study in this implementation the screen is
divided into a grid of cells 40 wide by 24 deep a cell may be live (red or dead (white) you begin by
creating the first generation of live cells or seed . . .
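The idea behind Step 7, keying a map by word position so that iteration restores the reading order, can be sketched as follows. The name rebuildText is hypothetical, and returning a string rather than printing is a choice made here so the result is easy to check.

```cpp
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<int> > WordMap;  // as typedefed in wordstats.h

// Hypothetical helper: merges both word maps into position order and joins
// the words with single spaces.
std::string rebuildText(const WordMap& known, const WordMap& unknown) {
    std::map<int, std::string> byPosition;
    const WordMap* sources[] = { &known, &unknown };
    for (const WordMap* source : sources) {
        for (const auto& entry : *source) {
            for (int position : entry.second) {
                byPosition[position] = entry.first;  // each position holds exactly one word
            }
        }
    }
    std::string text;
    for (const auto& slot : byPosition) {
        if (!text.empty()) text += ' ';
        text += slot.second;
    }
    return text;
}
```

A plain map is enough here, unlike Step 5, because no two words share a position.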
/***************************************************************************
* main.cpp
*
***************************************************************************/
#include <iostream>
#include "wordstats.h"
using namespace std;
int main(){
WordStats ws;
cout << "Begin Text File Analyser Tests ";
cout << "Step-1 Reading and displaying dictionary ";
ws.ReadDictionary();
ws.DisplayDictionary();
cout << "Step-2 Reading words from text file ";
ws.ReadTxtFile();
cout << "Step-3 Displaying known words: ";
ws.DisplayKnownWordStats();
cout << "Step-4 Displaying unknown words: ";
ws.DisplayUnknownWordStats();
cout << "Step-5 Displaying most frequent known words ";
ws.DisplayMostFreqKnownWords();
cout << "Step-6 Displaying most frequent unknown words ";
ws.DisplayMostFreqUnknownWords();
cout << "Step-7 Displaying original text ";
ws.DisplayOriginalText();
cout << " End Text File Analyser Tests ";
return 0;
}
/**********************************************************************
* wordstats.cpp
*
**********************************************************************/
#include <algorithm>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
#include "wordstats.h"
using namespace std;
WordStats::WordStats(){
strcpy(Filename,"testdata.txt");
}
// Reads dictionary.txt into Dictionary
void WordStats::ReadDictionary(){
}
// Displays the first 20 words in Dictionary
void WordStats::DisplayDictionary(){
}
// Reads textfile into KnownWords and UnknownWords
void WordStats::ReadTxtFile(){
}
// Displays stats of words in KnownWords
void WordStats::DisplayKnownWordStats(){
}
// Displays stats of words in Unknownwords
void WordStats::DisplayUnknownWordStats(){
}
// Displays 10 most frequent words in KnownWords
void WordStats::DisplayMostFreqKnownWords(){
}
// Displays 10 most frequent words in UnknownWords
void WordStats::DisplayMostFreqUnknownWords(){
}
// Displays original text from KnownWords & UnknownWords
void WordStats::DisplayOriginalText(){
}
// ============ Private Fns ========================
// add your private fns here...
/**********************************************************************
* wordstats.h
*
**********************************************************************/
#ifndef WORDSTATS_H_
#define WORDSTATS_H_
#include <map>
#include <set>
#include <string>
#include <vector>
using namespace std;
// Maps each word to the list of positions at which it occurs
typedef map<string, vector<int> > WordMap;
typedef WordMap::iterator WordMapIter;
class WordStats
{
public:
WordStats();
void ReadDictionary();
void DisplayDictionary();
void ReadTxtFile();
void DisplayKnownWordStats();
void DisplayUnknownWordStats();
void DisplayMostFreqKnownWords();
void DisplayMostFreqUnknownWords();
void DisplayOriginalText();
private:
WordMap KnownWords;
WordMap UnknownWords;
set<string> Dictionary;
char Filename[256];
//add your private fns here
};
#endif
Explanation / Answer
//wordstats.h
#ifndef WORDSTATS_H_
#define WORDSTATS_H_
#include <map>
#include <vector>
#include <string>
#include <set>
using namespace std;
// Maps each word to the list of positions at which it occurs
typedef map<string, vector<int> > WordMap;
typedef WordMap::iterator WordMapIter;
class WordStats
{
public:
    WordStats();
    void ReadDictionary();
    void DisplayDictionary();
    void ReadTxtFile();
    void DisplayKnownWordStats();
    void DisplayUnknownWordStats();
    void DisplayMostFreqKnownWords();
    void DisplayMostFreqUnknownWords();
    void DisplayOriginalText();
private:
    WordMap KnownWords;
    WordMap UnknownWords;
    set<string> Dictionary;
    char Filename[256];
    // Shared helpers called by the known/unknown display functions
    void DisplayWordStats(const WordMap& wordMap);
    void DisplayMostFreqWords(const WordMap& wordMap);
};
#endif
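//wordstats.cpp
#include "wordstats.h"
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <iostream>
using namespace std;
WordStats::WordStats() {
    // strcpy is the portable call; strcpy_s is not available on every compiler
    strcpy(Filename, "testdata.txt");
}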
// Reads dictionary.txt into Dictionary
void WordStats::ReadDictionary() {
    ifstream inFile("dictionary.txt");
    if (!inFile) {
        cerr << "Unable to open dictionary.txt" << endl;
        exit(1);
    }
    string word;
    while (inFile >> word) {
        // Store every dictionary word in lower case
        transform(word.begin(), word.end(), word.begin(), ::tolower);
        Dictionary.insert(word);
    }
    cout << Dictionary.size() << " words read from dictionary." << endl;
}
// Displays the first 20 words in Dictionary
void WordStats::DisplayDictionary() {
    cout << "Displaying the first 20 words in the dictionary..." << endl;
    size_t count = 0;
    for (const auto& word : Dictionary) {
        if (count == 20)
            break;
        cout << word << endl;
        count++;
    }
}
// Reads textfile into KnownWords and UnknownWords
void WordStats::ReadTxtFile() {
    ifstream inFile(Filename);
    if (!inFile) {
        cerr << "Unable to open " << Filename << endl;
        exit(1);
    }
    string token;
    int position = 0;
    while (inFile >> token) {
        // Lower-case the token and keep only letters, hyphens and apostrophes
        string word;
        for (unsigned char ch : token) {
            if (::isalpha(ch) || ch == '-' || ch == '\'')
                word += static_cast<char>(::tolower(ch));
        }
        // Assumption: a token with nothing left after cleaning is skipped
        // and does not take up a word position
        if (word.empty())
            continue;
        // operator[] creates an empty position list the first time a word is seen
        WordMap& target = Dictionary.count(word) ? KnownWords : UnknownWords;
        target[word].push_back(position);
        position++;
    }
    cout << KnownWords.size() << " known words read." << endl;
    cout << UnknownWords.size() << " unknown words read." << endl;
}
// Displays stats of words in KnownWords
void WordStats::DisplayKnownWordStats() {
    DisplayWordStats(KnownWords);
}
// Displays stats of words in UnknownWords
void WordStats::DisplayUnknownWordStats() {
    DisplayWordStats(UnknownWords);
}
// Displays 10 most frequent words in KnownWords
void WordStats::DisplayMostFreqKnownWords() {
    DisplayMostFreqWords(KnownWords);
}
// Displays 10 most frequent words in UnknownWords
void WordStats::DisplayMostFreqUnknownWords() {
    DisplayMostFreqWords(UnknownWords);
}
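// Displays original text from KnownWords & UnknownWords
void WordStats::DisplayOriginalText() {
    // Keying by position restores the order in which the words were read
    map<int, string> byPosition;
    for (const auto& entry : KnownWords)
        for (int position : entry.second)
            byPosition[position] = entry.first;
    for (const auto& entry : UnknownWords)
        for (int position : entry.second)
            byPosition[position] = entry.first;
    for (const auto& slot : byPosition)
        cout << slot.second << " ";
    cout << endl;
}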
// Displays each word with its count and positions
void WordStats::DisplayWordStats(const WordMap& wordMap)
{
    if (wordMap.empty())
        return;
    cout << "Word Count Position(s)" << endl;
    for (const auto& entry : wordMap) {
        cout << entry.first << " " << entry.second.size();
        for (int position : entry.second)
            cout << " " << position;
        cout << endl;
    }
}
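// Displays the 10 most frequent words in wordMap
void WordStats::DisplayMostFreqWords(const WordMap& wordMap)
{
    if (wordMap.empty())
        return;
    cout << "Word Count" << endl;
    // Key by count; a multimap keeps words that share a count
    multimap<int, string> byCount;
    for (const auto& entry : wordMap)
        byCount.emplace(static_cast<int>(entry.second.size()), entry.first);
    // Keys sort ascending, so walk backwards to list the largest counts first
    int shown = 0;
    for (auto it = byCount.rbegin(); it != byCount.rend() && shown < 10; ++it, ++shown)
        cout << it->second << " " << it->first << endl;
}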