Write a C program to take n files containing strings in ascending order (no dupl
ID: 3908138 • Letter: W
Question
Write a C program to take n files containing strings in ascending order (no duplicates within a file) and produce a file out.dat containing a line for each string (in ascending order). Even if a string str appears in multiple files, it should be output only once and, for each string, you should also output the number of files (k) containing the string. This should be done using code similar to:
fprintf(outfp,"%s %d ",str,k);
Getting Started:
1. Your program is to perform only one “heap assisted” merge of all n files simultaneously. At any time, there should be no more than one string from each of the input files being processed by your code. It will be useful to have a table of file pointers and a table of strings. Using a heap implementation with “handles” is highly recommended.
Under no circumstance should your program use multiple binary merges!
.
2. Your program will be driven by a file input.dat:
a. The first line will contain the value for n.
b. Each of the remaining n lines will contain a simple file name, i.e. there will not be a directory path.
c. Each of the n files will contain at least one string. The strings will consist of no more than 50 letters and digits.
input.dat
4
input0.dat
input1.dat
input2.dat
input3.dat
input0.dat
input0.dat
abc
ghi
hij
jkl
mno
p
uv
w
input1.dat
abc
efg
ghi
q
w
input2.dat
def
ghi
mno
rst
x
input3.dat
efg
ghi
mno
rst
w
y
The output should look like this.
out.dat
abc 2
def 1
efg 2
ghi 4
hij 1
jkl 1
mno 3
p 1
q 1
rst 2
uv 1
w 3
x 1
y 1
Pseudo-code:
a. Open input.dat, each of the n files, and out.dat.
b. Prime the heap with the first string from each file. The strings will be the priorities, so you will have a minHeap with the smallest (strcmp()) string conceptually at the root.
c. While at least one file has not been exhausted:
1. Remove (conceptually) the minimum string from the heap.
2. if the minimum string is different from the previous minimum
Output . . .
else
Change k
3. Attempt to read in another string from the same file as the string just removed.
if EOF
heap gets smaller
else
Put string in heap
d. Final clean-up . . . including output of the last string
Explanation / Answer
#include <stdio.h>
#include <string.h>
void swap(char *a, char *b){
char temp[50];
strcpy(temp,a);
strcpy(a,b);
strcpy(b,temp);
}
void heapify(char **arr, int n, int i)
{
int largest = i; // Initialize largest as root
int l = 2*i + 1; // left = 2*i + 1
int r = 2*i + 2; // right = 2*i + 2
if (l < n && strcmp(arr[l],arr[largest]) > 0)
largest = l;
if (r < n && strcmp(arr[r],arr[largest]) > 0)
largest = r;
if (largest != i)
{
swap(arr[i], arr[largest]);
heapify(arr, n, largest);
}
}
void heapSort(char **arr, int n)
{
int i;
for (i = n / 2 - 1; i >= 0; i--)
heapify(arr, n, i);
for (i=n-1; i>=0; i--)
{
swap(arr[0], arr[i]);
heapify(arr, i, 0);
}
}
int main(){
char data[200][50];
int count[200];
int i;
FILE *fp = fopen("input.dat","r");
FILE *fp1 = fopen("out.dat","w");
if (fp == NULL){
printf("Error opening file ");
return 0;
}
char line[100];
int count1 = 0;
int n;
fscanf(fp,"%d",&n);
for (int i = 0; i<n; i++){
fscanf(fp,"%s",line);
FILE *fp2 = fopen(line,"r");
char word[50];
while (!feof(fp2)){
fscanf(fp2,"%s",word);
int found = 0;
for (i = 0; i<count1; i++){
if (strcmp(data[i],word) == 0){
found = 1;
count[i]++;
}
}
if (found== 0){
strcpy(data[count1],word);
count1++;
}
}
}
heapSort((char **)data,count1);
for (i = 0; i<count1; i++){
fprintf(fp1,"%s %d ", data[i],count[i]);
}
return 0;
}
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.