plz describe in detail Ziv-Lempel (LZ) Dictionary Systems withexpalnation and ex
ID: 3618683 • Letter: P
Question
plz describe in detail Ziv-Lempel (LZ) Dictionary Systems withexpalnation and examples?Explanation / Answer
Ziv-Lempel Data Compression: Large text or graphics ?les are often compressedin order to save storage space or to speed up transmission when the ?le is shipped. Most operatingsystems have compression utilities, and some ?le transfer programs automaticallycompress, ship, and uncompress the ?le, without user intervention. The ?eld of textcompression is itself the subject of several books (for example see [8]), and willnot be handled in depth here. However, a popular compression method due to Ziv-Lempel[9, 10] has an e?cient implementation using su?x trees [6], providing anotherillustration of their utility. The Ziv-Lempel compression method is widely used (it is the basisfor the Unix utility compress), although there are actually several variants ofthe method that go by the same name (see [9, 10]). In this section, we present a basicvariant of the method and an e?cient implementation of it using su?x trees. De?nition For any position i in a string S oflength m, de?ne the substring Priori to be the longest pre?x of S[i..m] that also occurs as a substringof S[1..i 1]. For example, if S = abaxcabaxabz then Prior 7 is bax. De?nition For any position i in S, de?ne li as thelength of Priori . For li > 0, de?ne si as the starting position of the leftmost copy of Priori. In the above example, l7 = 3 and s7 = 2. Note that when li > 0, the copy of Priori starting at si istotally contained in S[1..i 1]. The Ziv-Lempel method uses some of the li and si values toconstruct a compressed representation of string S. The basic insight is that if the textS[1..i 1] has been represented (perhaps in compressed form) and li is greater thanzero, then the next li characters of S (substring Priori ) need not be explicitlydescribed. Rather, that substring can be described by the pair (si , li ), pointing to anearlier occurrence of the substring. Following this insight, a compression method couldprocess S left to right, outputting the pair (si , li ) in place of the explicitsubstring S[i..i + li 1] when possible, and outputting the character S(i) when needed. Fulldetails are given in the algorithm below. Compression Algorithm One begin i := 1 Repeat compute li and si if li > 0 then begin output (si ,li ) i := i +li end else begin outputS(i) i := i + 1 end Until i > n end. For example, S = abacabaxabz can be described as ab(1, 1)c(1,3)x(1, 2)z. Of course, in this example the number of symbols used to represent Sdid not decrease, but rather increased! That’s typical of small examples. Butas the string length increases, providing more opportunity for repeating substrings, thecompression im- proves. Moreover, the algorithm could choose to output characterS(i) explicitly whenever li is “small” (the actual rule depends on bitlevel considerations determined by the size of the alphabet, etc.) For a small example wherepositive compression is observed, consider the contrived string S =bababababababababababababababab, represented as ab(1, 2)(1, 4)(1, 8)(1, 16). That representationuses 24 symbols in placeof the original 32 symbols. If we extendthis example to contain k repeated copies of ab, then thecompressed representation contains approximately 5 log2 k symbols,a dramatic reduction in space. To decompress a compressed string, process the compressed stringleft to right, so that any pair (si , li ) in the representation points to asubstring that has already been fully decompressed. That is, assume inductively that the ?rst jterms (single char- acters or s, l pairs) of the compressed string have been processed,yielding characters 1 through i 1 of the original string S.The next term in the compressed string is either character S(i +1), or it is a pair (si , li ) pointing to a substring of Sstrictly before i. In either case, the algorithm has theinformation needed to decompress the j’th term, and since the?rst term in the compressed string is the ?rst character of S, weconclude by induction that the decompression algorithm can obtainthe original string S.
Related Questions
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.