Question

Linkage disequilibrium (LD) occurs when alleles at different loci are nonrandomly associated and are more likely to be inherited together. But how do scientists identify LD? For this question, you will calculate D, the coefficient of linkage disequilibrium to test whether a set of SNPs is in linkage equilibrium or disequilibrium. Suppose there are two variable sites (SNPs) on two different loci, and we collect sequence data from a sample of 2000 people. At SNP1, the nucleotide in the locus sequence is either a G or a C, while at SNP2, the nucleotide is either an A or a T. All combinations of SNP1 and SNP2 nucleotides are possible, and from the data we see that there are 474 individuals with GA haplotype, 611 with a GT haplotype, 142 with a CA haplotype, and 773 individuals with CT haplotype.

A) What are the haplotype frequencies?

B) Now, let’s calculate D. D = (g11 x g22) - (g12 x g21). Show work below. Are these loci in LD?

g11 = frequency of GA

g12 = frequency of GT

g21 = frequency of CA

g22 = frequency of CT

Explanation / Answer

Haplotype counts

GA = 474 GT = 611 CA = 142 CT = 773

Total = 2000

A) Calculation of haplotype and allele frequencies

Haplotype Frequencies

GA = 474 / 2000 = 0.2370

GT = 611 / 2000 = 0.3055

CA = 142 / 2000 = 0.0710

CT = 773 / 2000 = 0.3865

Allele frequencies

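G = GA + GT = 0.2370 + 0.3055 = 0.5425

C = CA + CT = 0.0710 + 0.3865 = 0.4575

A = GA + CA = 0.2370 + 0.0710 = 0.308

T = GT + CT = 0.3055 + 0.3865 = 0.692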
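
The frequencies in part A can be reproduced with a short Python sketch. The variable names (counts, hap_freq, allele_freq) are illustrative and not part of the original question, and the sketch assumes, as the answer does, that each of the 2000 sampled individuals contributes exactly one haplotype.

```python
# Haplotype counts from the question (assumes one haplotype per sampled individual).
counts = {"GA": 474, "GT": 611, "CA": 142, "CT": 773}
total = sum(counts.values())

# Haplotype frequency = haplotype count / total sample size.
hap_freq = {hap: n / total for hap, n in counts.items()}

# Allele frequency = sum of the frequencies of the haplotypes carrying that allele.
allele_freq = {
    "G": hap_freq["GA"] + hap_freq["GT"],
    "C": hap_freq["CA"] + hap_freq["CT"],
    "A": hap_freq["GA"] + hap_freq["CA"],
    "T": hap_freq["GT"] + hap_freq["CT"],
}

for name, value in {**hap_freq, **allele_freq}.items():
    print(f"{name}: {value:.4f}")
```

The two alleles at each SNP should sum to 1 (G + C and A + T), which is a quick way to catch an arithmetic slip.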

B) Now we substitute the haplotype frequencies into the equation for D given in the question:

D = (g11 x g22) - (g12 x g21)

D = (0.2370 x 0.3865) - (0.3055 x 0.0710) = 0.0699
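
As a check on the arithmetic, here is a minimal Python sketch of the same calculation. It also compares the result with the equivalent form D = g11 - (frequency of G x frequency of A), which is a standard identity rather than something stated in the question; the names D_check, p_G, and p_A are illustrative.

```python
# Haplotype frequencies from part A: GA, GT, CA, CT.
g11, g12, g21, g22 = 0.2370, 0.3055, 0.0710, 0.3865

# Coefficient of linkage disequilibrium as defined in the question.
D = g11 * g22 - g12 * g21

# Cross-check: D also equals the observed GA frequency minus the frequency
# expected if G and A were inherited independently (p_G * p_A).
p_G = g11 + g12
p_A = g11 + g21
D_check = g11 - p_G * p_A

print(f"D = {D:.4f}, cross-check = {D_check:.4f}")
```

Under linkage equilibrium the observed and expected GA frequencies would match and D would be 0, so a clearly nonzero D points toward disequilibrium.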

Next, we estimate Dmax from the allele frequencies. Here p1 and p2 are the frequencies of G and C at SNP1, and q1 and q2 are the frequencies of A and T at SNP2. Because D is positive:

Dmax = min [ (p1 x q2), (p2 x q1) ]

p1 x q2 = 0.5425 x 0.692 = 0.375 and p2 x q1 = 0.4575 x 0.308 = 0.141, so Dmax = 0.141

To calculate D’, we divide D by the value of Dmax from the previous step:

D’ = D / Dmax

D’ = 0.0699 / 0.141 = 0.496 ≈ 0.5
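
The normalization step can be sketched in Python as follows. The function name d_prime is hypothetical. The branch for negative D uses the usual convention Dmax = min(p1 x q1, p2 x q2) and reports the absolute value; that case does not arise in this problem and is included only as an assumption for completeness.

```python
def d_prime(D, p1, p2, q1, q2):
    """Return (|D| / Dmax, Dmax).

    p1, p2 are the allele frequencies at SNP1 (G, C);
    q1, q2 are the allele frequencies at SNP2 (A, T).
    """
    if D >= 0:
        # Case used in the worked answer: D is positive.
        d_max = min(p1 * q2, p2 * q1)
    else:
        # Assumed convention for negative D (not needed for this data set).
        d_max = min(p1 * q1, p2 * q2)
    return abs(D) / d_max, d_max


dp, d_max = d_prime(0.0699, 0.5425, 0.4575, 0.308, 0.692)
print(f"Dmax = {d_max:.3f}, D' = {dp:.3f}")
```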

Now we calculate the correlation coefficient (r) between the two loci from D and the allele frequencies calculated in the previous steps:

r = D / sqrt(p1 x p2 x q1 x q2)

r = 0.0699 / sqrt(0.5425 x 0.4575 x 0.308 x 0.692)

r = 0.0699 / 0.23 = 0.304

r² = (0.304)² = 0.0924
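
A minimal Python sketch of the correlation step, using the rounded values from the answer (variable names are illustrative):

```python
import math

D = 0.0699
p1, p2 = 0.5425, 0.4575  # allele frequencies of G and C at SNP1
q1, q2 = 0.308, 0.692    # allele frequencies of A and T at SNP2

# r is D scaled by the square root of the product of all four allele frequencies.
denominator = math.sqrt(p1 * p2 * q1 * q2)
r = D / denominator
r_squared = r ** 2

print(f"denominator = {denominator:.3f}, r = {r:.3f}, r^2 = {r_squared:.4f}")
```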

To test the significance of LD between the loci, we use the following equation, where N is the sample size:

χ² = r² x N

χ² = 0.0924 x 2000 = 184.8 (1 df)

With χ² = 184.8 and 1 degree of freedom, the P-value is less than 0.0001
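
The P-value can be checked without a statistics package. The sketch below relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability is erfc(sqrt(x / 2)). The helper name chi2_sf_1df is hypothetical, and N = 2000 follows the answer's choice of sample size.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


r_squared = 0.0924
N = 2000  # sample size used in the answer

chi2 = r_squared * N
p_value = chi2_sf_1df(chi2)

print(f"chi-square = {chi2:.1f}, P = {p_value:.3g}")
```

As a sanity check on the helper, the familiar 5% critical value for 1 degree of freedom (3.841) should return a probability close to 0.05.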

Based on these calculations, we can conclude that the two loci are in significant linkage disequilibrium, with D at about 50% of its theoretical maximum (D’ ≈ 0.5).
