


Question

a) How many different 8-mer sequences of DNA are there? (Hint: There are 16 possible dinucleotides and 64 possible trinucleotides.) We can quantify the information-carrying capacity of the nucleic acids in the following way. Each position can be one of four bases, corresponding to two bits of information (2^2 = 4). Thus a chain of 5100 nucleotides corresponds to 2 * 5100 = 10,200 bits or 1275 bytes.

b) How many bits of information are stored in an 8-mer DNA sequence? In the E. coli genome? In the human genome?

c) A typical computer CD can store up to 700 MB of information. Compare each of the three previously calculated values to the amount of information that can be stored on a computer CD.

Explanation / Answer

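A) 4^8 = 65,536. Each of the 8 positions can hold any of the 4 bases, so the number of sequences is 4 multiplied by itself 8 times (just as 4^2 = 16 dinucleotides and 4^3 = 64 trinucleotides). In computer terminology, there are 64K 8-mers of DNA.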
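The count in part A can be checked by brute force. The following is a minimal sketch, not part of the original answer; the helper name count_kmers is hypothetical, and it simply enumerates every sequence over the four bases and counts them.

```python
from itertools import product

BASES = "ACGT"  # the four DNA bases


def count_kmers(k: int) -> int:
    """Count all DNA sequences of length k by explicit enumeration."""
    return sum(1 for _ in product(BASES, repeat=k))


# The hint values from the question, then the 8-mer count next to 4**8.
print(count_kmers(2))          # dinucleotides
print(count_kmers(3))          # trinucleotides
print(count_kmers(8), 4 ** 8)  # enumeration vs. closed form
```

Enumeration is only practical for small k; for longer sequences the closed form 4**k gives the same count directly.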


B) Two bits are needed to specify a single nucleotide (base pair) in DNA. One bit can only distinguish two alternatives, so the first bit narrows the choice to one pair of bases (say, A or C versus G or T) and the second bit selects the base within that pair. For example, 00, 01, 10, and 11 could encode A, C, G, and T. An 8-mer stores 16 bits (2^16 = 65,536), the E. coli genome (4.6 x 10^6 bp) stores 9.2 x 10^6 bits, and the human genome (3.0 x 10^9 bases) stores 6.0 x 10^9 bits of genetic information.
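As an illustration of the two-bits-per-base idea, here is a small sketch that uses the example mapping from the answer (A=00, C=01, G=10, T=11). The function names encode and bits_for_length are hypothetical, and the genome sizes are the approximate figures quoted above.

```python
# Example 2-bit code from the answer; any one-to-one assignment would work.
ENCODING = {"A": "00", "C": "01", "G": "10", "T": "11"}


def encode(seq: str) -> str:
    """Return the bit string for a DNA sequence, two bits per base."""
    return "".join(ENCODING[base] for base in seq.upper())


def bits_for_length(n_bases: int) -> int:
    """Information content in bits of a sequence of n_bases nucleotides."""
    return 2 * n_bases


example = "ACGTACGT"                       # an arbitrary 8-mer
print(encode(example), len(encode(example)))

print(bits_for_length(8))                  # 8-mer
print(bits_for_length(4_600_000))          # E. coli, ~4.6 x 10^6 bp
print(bits_for_length(3_000_000_000))      # human, ~3.0 x 10^9 bases
```

The same function reproduces the worked example in the question: bits_for_length(5100) gives 10,200 bits, or 1275 bytes after dividing by 8.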

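C) A standard CD can hold about 700 megabytes, which is equal to 5.6 x 10^9 bits (700 x 10^6 bytes x 8 bits per byte). At 16 bits each, about 350 million 8-mer sequences could be stored on such a CD. The DNA sequence of E. coli (9.2 x 10^6 bits) could be written on a single CD, using less than 0.2% of its capacity. The human genome (6.0 x 10^9 bits) slightly exceeds 5.6 x 10^9 bits, so one CD would not be quite enough to record it.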
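The comparison in part C can be reproduced with a few lines of arithmetic. This sketch assumes 1 MB = 10^6 bytes (the convention that gives 5.6 x 10^9 bits) and the approximate genome sizes used in part B; the names fraction_of_cd and copies_per_cd are hypothetical.

```python
BITS_PER_BYTE = 8
CD_BITS = 700 * 10**6 * BITS_PER_BYTE  # assumes 1 MB = 10^6 bytes

# Two bits per base, using the sizes quoted in part B.
SIZES_IN_BITS = {
    "8-mer": 2 * 8,
    "E. coli genome": 2 * 4_600_000,
    "human genome": 2 * 3_000_000_000,
}


def fraction_of_cd(bits: int) -> float:
    """Fraction of one CD's capacity used by the given number of bits."""
    return bits / CD_BITS


def copies_per_cd(bits: int) -> int:
    """Number of whole copies of the given size that fit on one CD."""
    return CD_BITS // bits


for name, bits in SIZES_IN_BITS.items():
    print(f"{name}: {bits} bits, "
          f"{fraction_of_cd(bits):.2e} of a CD, "
          f"{copies_per_cd(bits)} whole copies fit")
```

A fraction above 1 (and zero whole copies) for the human genome matches the conclusion that a single CD falls just short.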