7.9 P07 Searching files for (key, value) pairs Summary: searching text files for
ID: 3590876 • Letter: 7
Question
7.9 P07 Searching files for (key, value) pairs
Summary: searching text files for (key, value) pairs, e.g. movies and their total revenue.
Google collects all sorts of interesting data: what words appear most frequently in searches? What web sites are most commonly searched using Alexa? What are the most popular baby names? For example, the data file "words-by-freq.txt" ranks the English words used most commonly in searches. Here's the start of the file:
The words are listed by rank, so "the" is the most frequently searched word with a frequency of "5627187200". The 2nd most frequently searched word is "of", with a frequency of "3395006400", and so on. The file ends with # and #.
In general, we call this type of file a "(key, value)" file, since the data comes in pairs, and the "key" uniquely identifies the "value". Note that the value is on the left, and the key is on the right. This implies you should view these types of files like this:
Note that you should view the values and keys as strings --- often the values are integers and the keys are strings, but it's dangerous to assume this is always true. Instead, treat all input data as strings, and use string variables.
In this exercise, you're going to write a complete C++ program that searches data from a (key, value) input file, outputting some statistics, values, and ranks. The main() program is written for you (and cannot be modified); your job is to implement the 2 functions in "search.cpp" that do the work of opening, inputting, and searching the input file. First, here's an example input sequence for the program:
The first input is the filename, and the remaining inputs are keys to search for. The keys are followed by #, which stops the program. Here's the correct output given this input sequence:
The file contains 8000 (key, value) pairs, an the top-ranked (first) pair is (the, 5627187200). The program was then asked to search for "headache", then "the", then "C++", and finally "meanwhile". If the key was found, the corresponding value and rank was output.
The tests cases involve 3 input files: "alexa-by-freq.txt", "baby-names-by-freq.txt", and "words-by-freq.txt". The zybook system does not show you the contents of the input files, but the files have been made available on the course web page until "Projects", then "project07-inputfiles". Here's a link to that folder.
BEFORE YOU START
We need to avoid a bug in some browsers. Above the editor pane you'll see the text "Current file: main.cpp", with a little drop-down arrow to the right. Click the drop-down and select "search.cpp". Then click the link "Load default template..." --- this will ensure the file is properly loaded for you to edit; you only need to do this once.
FUNCTIONS TO IMPLEMENT
Here are the functions you need to write, which are declared in "search.cpp". Note that the functions are "returning" values via additional reference parameters. You can read more about reference parameters in section 6.11, but in short all you need to know is that inside the function you need to assign values to the reference parameters, and the value you assign is the value that is "returned". In the case of GetFileStats, you need to assign to firstKey and firstValue. In the case of SearchForKey, you need to assign to value and rank.
PROGRAMMING INSTRUCTIONS
Recall that zyante has 2 modes, Develop mode and Submit mode. In Develop mode you must supply the program input. For this assignment, to get started supply the name of an input file --- "alex-by-freq.txt", "baby-names-by-freq.txt", or "words-by-freq.txt" --- followed by a #:
That will call your first function and output the results. When you are ready to test your 2nd function, add some search keys to the input, such as
In Submit mode you click "Submit for Grading" and see what happens. You are free to submit as many times as you want in this exercise. Recall that zyante only saves the most recent copy that you submitted --- i.e. the last time you clicked the "Run program" or "Submit for grading" button. If you close the browser and reload the web page, this is the version of the program you will see. There is no way to go back to earlier versions, or "save" your current work unless you Run / Submit. For this reason, it is recommended that you save your work in an editor or word processor.
You are free to work outside of zyante in a programming environment of your choice, but you must submit for grading using the zyante system: copy-paste your program into zyante, switch to Submit mode, and Submit for grading. Note that different programming environments have different defaults, so it's quite possible that a program working outside of zyante will fail when run inside zyante. This is not an error in the zyante system --- this means there is a logic error in your program. The most common mistake is failure to initialize a variable. If your program fails to produce output here in zyante, then switch to Develop mode, supply some test inputs, and Run program. Make sure your program compiles, and runs to completion. Add output statements if you need to debug further.
Explanation / Answer
/*search.cpp*/
#include<iostream>
#include<fstream>
#include<string.h>
#include<cstdlib>
using namespace std;
int GetFileStats(string filename, string& firstKey, string& firstValue)
{ int N = 0;
ifstream f;
char *dd;
dd=&filename[0];
f.open(dd);
if(f.is_open())
cout<<"file successfully opened";
else
exit(0);
int i=0;
while(!f.eof())
{
string str;string value;
f>>value>>str;
if(str.length()!=0)
N++;
if(i==0)
{
firstKey.assign(str);
firstValue.assign(value);
}
i++;
}
f.close();
return N;
}
//
// SearchForKey
//
// Opens the given file, searches for the given key, and if found, returns true; if the
// key is not found then false is returned.
//
// If the key is found, the value will be returned via the "value" parameter; likewise
// the rank will be returned via the "rank" parameter. The rank is simply the line #
// where the (key, value) pair was found: 1 for first line, 2 for second line, etc.
// If the key is not found, value and rank should be ignored.
//
// NOTE: it is assumed the file contains at least one (key, value) pair, in this format:
//
// value key
// value key
// value key
// # #
//
// where value and key are one word strings (or numbers that will be input as strings).
// If the file cannot be opened, the program will be exited.
//
bool SearchForKey(string filename, string key, string& value, int& rank)
{
//
// initialize return value / parameters:
string str;string n;int i=0; bool found;
ifstream f;
char *dd;
dd=&filename[0];
f.open(dd);
while(!f.eof())
{
f>>n>>str;
if(str==key){
value=n;
rank=i+1;
break;
}
i++;
}
if(f.eof())
found=false;
else
found=true;
// not found:
//
// TODO: search for the key, and if found, we want to return the associated value.
// But we also need to return the "rank" via the rank parameter; the rank is the
// line number where the (key, value) appears: 1 for first line, 2 for second
// line, and so on. Finally, return true if key was found, false if not.
//
// Don't forget to close the file when you're done.
//
return found;
}
//This is the main.ccp you do not edit.
/*main.cpp*/
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
using namespace std;
//
// function declarations:
//
int GetFileStats(string filename, string& firstKey, string& firstValue);
bool SearchForKey(string filename, string key, string& value, int& rank);
int main()
{
string filename;
cin >> filename;
cout << "**Starting**" << endl;
cout << "File: '" << filename << "'" << endl;
cout << endl;
//
// First, let's get some statistics about the data file:
//
int N, rank;
string key, value;
bool found;
N = GetFileStats(filename, key, value);
cout << "# of entries: " << N << endl;
cout << "Top-ranked: (" << key << ", " << value << ")" << endl;
cout << endl;
//
// Now let's input keys from the keyboard and look for them in the file:
//
cin >> key;
while (key != "#")
{
found = SearchForKey(filename, key, value, rank);
if (found == true)
{
cout << ">>Found key: " << key << endl;
cout << " Value= " << value << endl;
cout << " Rank= " << rank << endl;
}
else // not found:
{
cout << ">>Not found: " << key << endl;
}
cin >> key;
}
cout << endl;
cout << "**End**" << endl;
return 0;
}
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.