Question
3. Listed below are several restriction enzymes, along with the specific nucleotide sequence that each enzyme recognizes:
Restriction enzyme Recognition sequence
AluI 5’-AGCT-3’
HaeIII 5’-GGCC-3’
HindIII 5’-AAGCTT-3’
BamHI 5’-GGATCC-3’
NotI 5’-GCGGCCGC-3’
Let’s assume that you would like to digest chromosome 1 from the fruit fly Drosophila with these restriction enzymes.
For each of the above enzymes, if we assume that the Drosophila genome is composed of 30% A, 30% T, 20% G, and 20% C, what is the average distance between each of these recognition sequences?
Explanation / Answer
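1- AluI 5’-AGCT-3’
Assuming each position is independent of the others, the probability of a site is the product of the base frequencies at each position, and the average distance between sites is the reciprocal of that probability.
the probability of A at the first position is 3/10
the probability of G at the second position is 2/10
the probability of C at the third position is 2/10
the probability of T at the fourth position is 3/10
So the probability of AGCT is 3/10*2/10*2/10*3/10 = 36/10000, or 9/2500
That means that, on average, we will find one site for AluI every 278 base pairs (2500/9 ≈ 277.8)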
2- HaeIII 5’-GGCC-3’
the probability of G at the first position is 2/10
the probability of G at the second position is 2/10
the probability of C at the third position is 2/10
the probability of C at the fourth position is 2/10
So the probability of GGCC is 2/10*2/10*2/10*2/10 = 16/10000, or 1/625
That means that, on average, we will find one site for HaeIII every 625 base pairs
3- HindIII 5’-AAGCTT-3’
the probability of A at the first position is 3/10
the probability of A at the second position is 3/10
the probability of G at the third position is 2/10
the probability of C at the fourth position is 2/10
the probability of T at the fifth position is 3/10
the probability of T at the sixth position is 3/10
So the probability of AAGCTT starting at any given position is
3/10*3/10*2/10*2/10*3/10*3/10 = 324/1000000
That means that, on average, we will find one site for HindIII every 3086 base pairs (1000000/324 ≈ 3086.4)
4- BamHI 5’-GGATCC-3’
the probability of G at the first position is 2/10
the probability of G at the second position is 2/10
the probability of A at the third position is 3/10
the probability of T at the fourth position is 3/10
the probability of C at the fifth position is 2/10
the probability of C at the sixth position is 2/10
So the probability of GGATCC is 2/10*2/10*3/10*3/10*2/10*2/10 = 144/1000000
That means that, on average, we will find one site for BamHI every 6944 base pairs (1000000/144 ≈ 6944.4)
5- NotI 5’-GCGGCCGC-3’
the probability of G at the first position is 2/10
the probability of C at the second position is 2/10
the probability of G at the third position is 2/10
the probability of G at the fourth position is 2/10
the probability of C at the fifth position is 2/10
the probability of C at the sixth position is 2/10
the probability of G at the seventh position is 2/10
the probability of C at the eighth position is 2/10
So the probability of GCGGCCGC is 2/10*2/10*2/10*2/10*2/10*2/10*2/10*2/10 = 256/100000000, or 1/390625
That means that, on average, we will find one site for NotI every 390625 base pairs
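The five calculations above can be checked with a short script. This is only a sketch: it assumes every position is independent and uses the base frequencies given in the question. The names BASE_FREQ, SITES, site_probability and average_spacing are made up for this illustration. Exact fractions are used so that no rounding creeps in before the final step.

```python
from fractions import Fraction

# Base frequencies stated in the question: 30% A, 30% T, 20% G, 20% C.
BASE_FREQ = {
    "A": Fraction(3, 10),
    "T": Fraction(3, 10),
    "G": Fraction(2, 10),
    "C": Fraction(2, 10),
}

# Recognition sequences from the question (5' to 3').
SITES = {
    "AluI": "AGCT",
    "HaeIII": "GGCC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "NotI": "GCGGCCGC",
}


def site_probability(site):
    """Probability that a site starts at a given position,
    assuming independent positions (product of base frequencies)."""
    p = Fraction(1)
    for base in site:
        p *= BASE_FREQ[base]
    return p


def average_spacing(site):
    """Expected distance between sites in base pairs (1 / probability)."""
    return 1 / site_probability(site)


if __name__ == "__main__":
    for enzyme, site in SITES.items():
        p = site_probability(site)
        spacing = float(average_spacing(site))
        print(f"{enzyme:8s} {site:9s} p = {p}  spacing = {spacing:.1f} bp")
```

Running it prints one line per enzyme with the exact probability as a fraction and the average spacing in base pairs, which can be compared against the hand calculations.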