Question

Write a MyGrep utility in C. MyGrep is similar to the grep utility provided by Unix. It takes options, words, and a text file as arguments.

$ MyGrep c "This is a list of words" test.txt

Count the occurrences of the string “This is a list of words” in the contents of the file test.txt.

$ MyGrep c -i "This is a list of words" test.txt

Count the occurrences of the string “This is a list of words” in the contents of the file test.txt, ignoring case.
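
The two counting modes could be sketched in C as follows. This is a minimal sketch, not a full solution: the function names `matches_at` and `count_occurrences` are hypothetical, the text is assumed to be already loaded into a NUL-terminated string, and matches are counted without overlap (the question does not say how overlapping matches should be treated).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `needle` matches at the start of `s`, else 0.
   With ignore_case set, characters are compared after tolower(). */
static int matches_at(const char *s, const char *needle, int ignore_case)
{
    for (; *needle != '\0'; s++, needle++) {
        unsigned char a = (unsigned char)*s;
        unsigned char b = (unsigned char)*needle;
        if (a == '\0')
            return 0;               /* text ended before the needle did */
        if (ignore_case ? tolower(a) != tolower(b) : a != b)
            return 0;
    }
    return 1;
}

/* Counts non-overlapping occurrences of `needle` in `text`.
   An empty needle is treated as having no occurrences (an assumption). */
size_t count_occurrences(const char *text, const char *needle, int ignore_case)
{
    size_t count = 0;
    size_t len;

    assert(text != NULL && needle != NULL);
    len = strlen(needle);
    if (len == 0)
        return 0;
    while (*text != '\0') {
        if (matches_at(text, needle, ignore_case)) {
            count++;
            text += len;            /* skip past the match */
        } else {
            text++;
        }
    }
    return count;
}
```

Passing 0 as the last argument would correspond to the plain `c` option, and a nonzero value to `c -i`.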

$ MyGrep o "This is a list of words" test.txt

Output all lines containing “This is a list of words” and highlight the matched string.

$ MyGrep s test.txt

Remove all leading spaces in each line and output the result.

$ MyGrep n test.txt

In the output, add a line number at the beginning of each line.

$ MyGrep n "This is a list of words" test.txt

Output the lines containing the string “This is a list of words” and add a line number at the beginning of each matched line.
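
Both forms of the `n` option could share one routine. The sketch below is one possible shape, under stated assumptions: the function name `print_numbered` and the `MAX_LINE` limit are hypothetical, lines are assumed to be shorter than `MAX_LINE` characters, and the `number:line` separator is borrowed from `grep -n` because the question does not fix a format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE 4096   /* assumed upper bound on line length */

/* Copies lines from `in` to `out`, prefixing each with its line number.
   If `pattern` is non-NULL, only lines containing it are written.
   Returns the number of lines written. */
unsigned long print_numbered(FILE *in, FILE *out, const char *pattern)
{
    char line[MAX_LINE];
    unsigned long lineno = 0;
    unsigned long written = 0;

    assert(in != NULL && out != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        lineno++;
        if (pattern == NULL || strstr(line, pattern) != NULL) {
            fprintf(out, "%lu:%s", lineno, line);
            written++;
        }
    }
    return written;
}
```

Calling it with a NULL pattern numbers every line; calling it with a pattern numbers only the matching lines, and each line keeps its number from the original file.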

$ MyGrep k "This" "is" "a" "list" "of" "words" test.txt

Count the occurrences of each word and output the words along with their counts in decreasing order.

For example, the output could be:

a 20

is 8

list 5

words 3

of 2

this 1

Explanation / Answer

#!/bin/bash

# wf.sh: Crude word frequency analysis on a text file.

# This is a more efficient version of the "wf2.sh" script.

# Check for input file on command-line.

ARGS=1

E_BADARGS=45

E_NOFILE=66

if [ $# -ne "$ARGS" ] # Correct number of arguments passed to script?

then

echo "Usage: `basename $0` filename"

exit $E_BADARGS

fi

if [ ! -f "$1" ]       # Check if file exists.

then

echo "File \"$1\" does not exist."

exit $E_NOFILE

fi

# main ()

sed -e 's/\.//g' -e 's/,//g' -e 's/ /\
/g' "$1" | tr 'A-Z' 'a-z' | sort | uniq -c | sort -nr

#                          

#                            Frequency of occurrence

# Filter out periods and commas, and

#+ change space between words to linefeed,

#+ then shift characters to lowercase, and

#+ finally prefix occurrence count and sort numerically.

# Arun Giridhar suggests modifying the above to:

# . . . | sort | uniq -c | sort +1 [-f] | sort +0 -nr

# This adds a secondary sort key, so instances of

#+ equal occurrence are sorted alphabetically.

# As he explains it:

# "This is effectively a radix sort, first on the

#+ least significant column

#+ (word or string, optionally case-insensitive)

#+ and last on the most significant column (frequency)."

#

# As Frank Wang explains, the above is equivalent to

#+       . . . | sort | uniq -c | sort +0 -nr

#+ and the following also works:

#+       . . . | sort | uniq -c | sort -k1nr -k

exit 0
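
The script above is a shell pipeline, while the question asks for C. The same idea (lowercase, count, sort by frequency) could be sketched in C for the `k` option roughly as follows. This is a sketch under assumptions, not a reference solution: the names `word_count`, `token_equals`, `by_count_desc`, and `count_words` are hypothetical, the file contents are assumed to be already in memory, words are taken to be runs of letters and digits, and ties are broken alphabetically in the spirit of the secondary sort key mentioned in the script's comments.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct word_count {
    const char *word;   /* word to look for (not owned by the struct) */
    size_t count;       /* number of occurrences found */
};

/* Case-insensitive comparison of the token tok[0..len) with `word`. */
static int token_equals(const char *tok, size_t len, const char *word)
{
    size_t i;

    if (strlen(word) != len)
        return 0;
    for (i = 0; i < len; i++)
        if (tolower((unsigned char)tok[i]) != tolower((unsigned char)word[i]))
            return 0;
    return 1;
}

/* qsort comparator: higher count first, ties in alphabetical order. */
static int by_count_desc(const void *pa, const void *pb)
{
    const struct word_count *a = pa;
    const struct word_count *b = pb;

    if (a->count != b->count)
        return a->count < b->count ? 1 : -1;
    return strcmp(a->word, b->word);
}

/* Counts whole-word occurrences of each wc[i].word in `text`,
   then sorts the array by decreasing count. */
void count_words(const char *text, struct word_count *wc, size_t n)
{
    size_t i;

    assert(text != NULL && wc != NULL);
    for (i = 0; i < n; i++)
        wc[i].count = 0;
    while (*text != '\0') {
        size_t len = 0;

        while (*text != '\0' && !isalnum((unsigned char)*text))
            text++;                         /* skip separators */
        while (isalnum((unsigned char)text[len]))
            len++;                          /* measure the next token */
        for (i = 0; i < n; i++)
            if (len > 0 && token_equals(text, len, wc[i].word))
                wc[i].count++;
        text += len;
    }
    qsort(wc, n, sizeof wc[0], by_count_desc);
}
```

After the call, printing each `wc[i].word` followed by `wc[i].count` in array order would produce a listing in the format of the example in the question.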
