Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

1. (1 point) A file contains only colons, spaces, newlines, commas, and digits i

ID: 3720859 • Letter: 1

Question

1. (1 point) A file contains only colons, spaces, newlines, commas, and digits in the following frequency: colon(100), space(605), the newline(100), comma(705), 0(431), 1(242), 2(176), 3(59), 4(185), 5(250), 6(174), 7(199), 8(205), 9(217). Construct the Huffman code. (Think about) Part of the encoded file must be a header indicating the Hoffman code. Give a method for constructing the header of size at most O(N) (in addition to the symbols), where N is the number of symbols. (Think about) Complete the proof that Huffman algorithm generates an optimal prefix code. (Think about) Show that if the symbols are sorted by frequency, Huffman's algorithm can be implemented in linear time.

Explanation / Answer

1)The huffman coding algorithm begins with a forest of trees consisting of one vertex each.

At each step we combine two trees having the least total weight into a single tree by introducing a new root and placing the larger weight at its left subtree and the tree with smaller weight at its right subtree.

For the given problem, we have:

Continuing this way we get the following tree:

For example the code for the value 1 would be: 1011 and for space it would be:110.