Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

I have an English dictionary (text file) and the frequency of 2-grams, 3-grams a

ID: 649760 • Letter: I

Question

I have an English dictionary (text file) and the frequency of 2-grams, 3-grams and 4-grams as the beginning of each word. I need to write an algorithm that, with a given word, calculates the possible anagrams (which have to be into the dictionary) based on transitions probability (my assignment requires that I use this method).

I thought to use the Viterbi Algorithm but I don't know if it is the right choise. What do you think about a simulated annealing algorithm? Can you give me any help or advice?

Explanation / Answer

There's a better way to list all anagrams.

Given a word, compute its "canonical form" by sorting the individual letters into alphabetical order. For instance, BASEBALL becomes AABBELLS. Do this for each word in the dictionary, and build a data structure that maps from a canonical form to the list of words with that canonical form (e.g., from AABBELLS to BASEBALL) -- for instance, you might use a hashtable or a sorted list.

Now, given one English word, if you want to find all anagrams of it, you can rapidly compute all anagrams by computing its canonical form and then looking that up in your data structure.

This doesn't require n-grams, transition probabilities, Viterbi algorithm, or anything fancy like that -- just sorting and simple data structures. Thus, it can be coded up quickly and elegantly; and it will be very efficient.

Hire Me For All Your Tutoring Needs
Integrity-first tutoring: clear explanations, guidance, and feedback.
Drop an Email at
drjack9650@gmail.com
Chat Now And Get Quote