[20 points Collaborative Problem: As some of you know well, and others of you ma
ID: 3720018 • Letter: #
Question
[20 points Collaborative Problem: As some of you know well, and others of you may be interested to learn, a number of languages (including Chinese and Japanese) are written without spaces between the words. Consequently, software that works with text written in these languages must address the word segmentation problem -inferring likely boundaries between consecutive words in the text. If English were written without spaces, the analogous problem would consist of taking a string like "meetateight" and deciding that the best segmentation is "meet at eight" (and not "me et at eight" or "meet ate ight" or any of a huge number of even less plausible alternatives). How could we automate this process? A simple approach that is at least reasonably effective is to find a segmentation that simply max imizes the cumulative "quality" of its individual constituent words. Thus, suppose you are given a black box that, for any string of letters z-x1x2·.. Zk, will return a number quality(x). This number can be either positive or negative; larger numbers correspond to more plausible English words. (So quality("me") would be positive while quality("ight") would be negative.) Given a long string of letters yyy2yn, a segmentation of y is a partition of its letters into contiguous blocks of letters, each block corresponding to a word in the segmentation. The total quality of a segmentation is determined by adding up the qualities of each of its blocks. (So we would get the right answer above provided that quality("meet") + quality("at"quality("eight") was greater than the total quality of any other segmentation of the string.) Give an efficient algorithm that takes a string y and computes a segmentation of maximum total quality. You can treat a single call to the black box computing quality(x) as a single computational step. Prove the correctness of your algorithm and analyze its time complexity.Explanation / Answer
ANSWER
GIVEN BY
Define Subproblem as consider by :
Let TQ[i] be the maximum whole by quality of any segmentation in the string y1y2....yi
Define reappearance relation as consider by :
Let yjyj+1....yi be the last block in the cord y1y2.....yi . Now we can define the reappearance relation as
TQ[i] = max1?j?i {TQ[j-1] + Q( yjyj+1....yi)}, if i ? 1
0 , if i = 0
To discover out the value of TQ[i] we must be acquainted with the worth of TQ[i'] where i' < i, so we can start computing TQ[] as of left to right
BEST_SCORE(y1y2.....yi)
1. TQ[0] ?0
2. for i?1 to n
3. TQ[i]?-?
4. for j?1 to i
5. if TQ[i] < TQ[j-1] + Q(yjyj+1....yi)
6. TQ[i] ?TQ[j-1] + Q(yjyj+1....yi)
7. Record[i] ? j
8. endif
9. end for
10 end for
11 return TQ[n]
BACKTRACK(Record)as consider by
1. segmentation?[]
2. i ?n
3. while i > 0
4. j ?Record[i]
5. add yjyj+1....yi to the begining of the segmentation
6. i ?j-1
7. end while
8. return segmentation
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.