Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

I am trying to break a packet format. The packet format simply packs several fil

ID: 649917 • Letter: I

Question

I am trying to break a packet format. The packet format simply packs several files into one big file. The file contents are plain. But the index data which contain offsets, file sizes and filenames are encrypted. The index data are about 1 KB and compressed by zlib (the DEFLATE algorithm with no zlib or gzip wrapper), and then XOR encrypted. In other words, the key is concatenated with itself many times to get a keystream of equal length as the zlib-compressed data, and then the zlib-compressed data is XORed with this keystream (like a Vigen

Explanation / Answer

If you have known plaintext, namely one input file that is known in its entirety, this is trivial to break. So I'll explore methods that might lead to a break, if you don't know what's in the input file that was compressed.

I suggest that you start by analyzing the DEFLATE stream format carefully (see also these handy notes). This will probably help you derive some consistent relationships that the first handful of bits of the DEFLATE stream must satisfy.

On first glance, it looks to me like the beginning of the DEFLATE stream has the following structure:

As Stephen Tousot explains, in normal operation the first 3 bits of the DEFLATE stream are likely to be 110. (You may want to double-check this with the compression program you are using.)

The next three fields are HLIT (5 bits), HDIST (5 bits), and HCLEN (4 bits). I suggest that you take a bunch of sample inputs that would be typical for your application, compress them with the compression implementation you're using, and look at the typical range of values for HLIT, HDIST, and HCLEN in the resulting DEFLATE stream. I suspect you may find that not all values are equally likely.

After the aforementioned 17 bits is a chunk of 3*(HCLEN+4) bits, which is interpreted as a sequence of 3-bit values. If I understand the DEFLATE algorithm correctly (and I might not), if you permute these appropriately using a fixed permutation found in the DEFLATE specification and ignore all 000-values, then the remaining 3-bit values should be in sorted order. This provides an additional constraint on possible values for those bits.

Then there's some more stuff that I'm ignoring right now.

Hire Me For All Your Tutoring Needs
Integrity-first tutoring: clear explanations, guidance, and feedback.
Drop an Email at
drjack9650@gmail.com
Chat Now And Get Quote