Question

Given an article such as this one at nytimes.com ( http://www.nytimes.com/2015/03/01/books/review/kazuo-ishiguros-the-buried-giant.html?ref=books&_r=0 ), design an algorithm to find the most frequently co-occurring word-pair in this article. Two words are said to co-occur if they appear in the same sentence. You can assume you have access to a subroutine, sentenceSplitter(article), that can accurately segment an article into separate sentences. Please describe your algorithm unambiguously using pseudo code with necessary comments in English. Assume you start with an "article" variable that already contains the full text of an article.

Explanation / Answer

The following algorithm finds the most frequently co-occurring pair of words in the article.

CoOccur(article):

// article contains the full text of the article.

// Tokenize the article and store the words that occur in it in an array called words[].

// Make sure that words[] does not contain duplicates.

Sort(words) // optional: the array now holds the words in dictionary order

// Store the number of distinct words in a variable.

noWords = words.length

// Create a two-dimensional array of integers to store the co-occurrence count of each pair of words.

// Initialize every entry of the array to zero.

Counter[noWords][noWords] = 0

// Split the article into separate sentences and store them in a string array.

Sentences[] = sentenceSplitter(article)

// Count, for each pair of distinct words, the number of sentences that contain both of them.

For i = 1 to noWords

    For j = 1 to noWords

        For k = 1 to Sentences.length

            If i != j and words[i] and words[j] both occur as whole words in Sentences[k]

                // Increment Counter[i][j] by 1.

                Counter[i][j] = Counter[i][j] + 1

            Endif

        Endfor

    Endfor

Endfor

// Highest holds the largest co-occurrence count seen so far.

Highest = 0

// x1 and x2 hold the indexes of the most frequently co-occurring pair.

// Find the pair (i, j) for which Counter[i][j] is larger than every other entry.

For i = 1 to noWords

    For j = 1 to noWords

        If Counter[i][j] > Highest

            Highest = Counter[i][j]

            x1 = i

            x2 = j

        Endif

    Endfor

Endfor

Print words[x1] + " and " + words[x2] + " are the most frequently co-occurring word pair"
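
The pseudocode above tests every pair of words against every sentence. As a rough Python sketch of the same idea, the version below walks the sentences once and counts the pairs found inside each one, which gives the same counts with less work. The names sentence_splitter and most_frequent_pair, the regex-based tokenization, and the naive sentence splitter are all assumptions made for illustration; the question only says that a sentenceSplitter(article) subroutine is available.

```python
import re
from collections import Counter
from itertools import combinations


def sentence_splitter(article):
    # Hypothetical stand-in for the sentenceSplitter(article) subroutine the
    # question assumes; this naive version splits after '.', '!' or '?'.
    return [s for s in re.split(r"(?<=[.!?])\s+", article.strip()) if s]


def most_frequent_pair(article):
    pair_counts = Counter()
    for sentence in sentence_splitter(article):
        # Assumed tokenization: lowercase runs of letters and apostrophes.
        # set() removes repeats so a pair counts once per sentence;
        # sorted() makes (a, b) and (b, a) map to the same key.
        words = sorted(set(re.findall(r"[a-z']+", sentence.lower())))
        pair_counts.update(combinations(words, 2))
    if not pair_counts:
        return None, 0  # no sentence contains two distinct words
    return pair_counts.most_common(1)[0]


sample = "The giant slept. The giant woke. A knight saw the giant sleep."
pair, count = most_frequent_pair(sample)
print(pair, count)  # ('giant', 'the') 3
```

Ties are broken by whichever pair was counted first, and common words such as "the" will dominate unless they are filtered out before counting.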
