
Example: Coronary Heart Disease Data were taken from the Framingham longitudinal study


Question

Example: Coronary Heart Disease. Data were taken from the Framingham longitudinal study (Cornfield, 1962). In this study, n = 1329 patients were classified by cholesterol level and whether they had been diagnosed with coronary heart disease (CHD).

Serum cholesterol (mg/100 cc)    CHD    no CHD    total
0-199                             12       307      319
200-219                            8       246      254
220-259                           31       439      470
260+                              41       245      286
total                             92      1237     1329

Question: Is there any evidence of a relationship between cholesterol and heart disease? Please compare test results using X^2, G^2 and M^2. Draw your conclusion in the context of the research problem. For M^2, choose three different sets of scores and conduct a sensitivity analysis. Do it after class. Turn in your report with R code attached next class.

Explanation / Answer

We can easily calculate the expected cell counts Eij using the Minitab command chisq, as shown below.

MTB > read c1-c4

DATA> 12 8 31 41

DATA> 307 246 439 245

DATA> end

2 rows read.

MTB > chisq c1-c4

Expected counts are as below observed counts

            C1        C2        C3        C4     Total
    1       12         8        31        41        92
         22.08     17.58     32.54     19.80

    2      307       246       439       245      1237
        296.92    236.42    437.46    266.20

Total      319       254       470       286      1329

ChiSq = 4.604 + 5.223 + 0.072 + 22.704 + 0.342 + 0.388 + 0.005 + 1.689 = 35.028

df = 3
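The expected counts and the eight contributions to the chi-square sum can also be reproduced by hand. The following is a minimal R sketch, assuming base R only; the object names (obs, expected, contrib) are illustrative and not part of the original answer.

```r
# Observed counts: rows = CHD status, columns = cholesterol group.
obs <- matrix(c( 12,   8,  31,  41,
                307, 246, 439, 245),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("CHD", "no CHD"),
                              c("0-199", "200-219", "220-259", "260+")))

# Expected counts under independence: E_ij = (row total * column total) / n.
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Per-cell contributions (O - E)^2 / E; their sum is the Pearson X^2 statistic.
contrib <- (obs - expected)^2 / expected
X2 <- sum(contrib)
df <- (nrow(obs) - 1) * (ncol(obs) - 1)

print(round(expected, 2))
print(round(contrib, 3))
cat("X2 =", round(X2, 3), " df =", df, "\n")
```

The largest contribution comes from the CHD cell of the 260+ group, where far more cases were observed than independence would predict.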

MTB > cdf 35.028;

SUBC> chisq 3.

35.0280 1.0000
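Because the printed CDF value is rounded to four decimals, the size of the tail area is hidden. As a small R sketch, assuming base R's pchisq(), the upper tail can be requested directly rather than computed as 1 minus the CDF:

```r
# Upper-tail probability of a chi-square variable with 3 df beyond the observed X^2.
X2 <- 35.028
p_value <- pchisq(X2, df = 3, lower.tail = FALSE)
print(p_value)  # far below any conventional significance level
```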

The CDF value rounds to 1.0000, so the p-value (1 minus the CDF) is essentially zero, and the evidence of a relationship is very strong. The same computation is shown below in S-PLUS, using the function chisq.test().

> x <- c(12,8,31,41,307,246,439,245)

> x <- matrix(x,2,4,byrow=T)

> chisq.test(x)

Pearson’s chi-square test without Yates’ continuity correction

data: x

X-squared = 35.0285, df = 3, p-value = 0

The X^2 test for independence therefore gives very strong evidence that a relationship exists between cholesterol and CHD.
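The question also asks for the likelihood-ratio statistic G^2, which the sessions above do not compute. A minimal R sketch follows, assuming the standard definition G^2 = 2 Σ O log(O/E) and the same 3 degrees of freedom as X^2; the variable names are illustrative.

```r
# Observed counts: rows = CHD status, columns = cholesterol group.
obs <- matrix(c( 12,   8,  31,  41,
                307, 246, 439, 245),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("CHD", "no CHD"),
                              c("0-199", "200-219", "220-259", "260+")))

# Pearson test; chisq.test() also returns the expected counts.
fit <- chisq.test(obs, correct = FALSE)
X2  <- unname(fit$statistic)
df  <- unname(fit$parameter)
E   <- fit$expected

# Likelihood-ratio statistic: G^2 = 2 * sum(O * log(O / E)).
G2 <- 2 * sum(obs * log(obs / E))

# Upper-tail p-values on the same degrees of freedom.
p_X2 <- fit$p.value
p_G2 <- pchisq(G2, df = df, lower.tail = FALSE)

print(c(X2 = X2, G2 = G2, df = df))
print(c(p_X2 = p_X2, p_G2 = p_G2))
```

With a sample this large the two statistics should lead to the same conclusion; a noticeable gap between them would usually point to sparse cells, which is not the case here.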

It would make sense to estimate the conditional probabilities of CHD within the four cholesterol groups. To do this, we estimate P(Y = i|Z = j).

$$P(Y = i \mid Z = j) = \frac{P(Y = i, Z = j)}{P(Z = j)}, \qquad \hat{P}(Y = i \mid Z = j) = \frac{n_{ij}/n_{++}}{n_{+j}/n_{++}} = \frac{n_{ij}}{n_{+j}}$$

Cholesterol    P(CHD | group)      P(no CHD | group)
0-199          12/319 = .038       307/319 = .962
200-219         8/254 = .031       246/254 = .969
220-259        31/470 = .066       439/470 = .934
260+           41/286 = .143       245/286 = .857
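These column-wise proportions n_ij / n_+j can be obtained in one step. A short R sketch, assuming base R's prop.table() with margin = 2 (divide each column by its total):

```r
# Observed counts: rows = CHD status, columns = cholesterol group.
obs <- matrix(c( 12,   8,  31,  41,
                307, 246, 439, 245),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("CHD", "no CHD"),
                              c("0-199", "200-219", "220-259", "260+")))

# Estimated P(Y = i | Z = j): each cell divided by its column total.
cond <- prop.table(obs, margin = 2)
print(round(cond, 3))
```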

The risk of CHD appears to be essentially constant for the 0–199 and 200–219 groups: a test of independence for the 2 × 2 table

          0-199   200-219
CHD          12         8
no CHD      307       246

yields X^2 = 0.157, p-value = .69.
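The 2 × 2 comparison can be checked in R as well. This is a sketch assuming base R's chisq.test(); note that for 2 × 2 tables it applies Yates' continuity correction by default, so correct = FALSE is needed to match the uncorrected statistic quoted above.

```r
# Observed counts: rows = CHD status, columns = cholesterol group.
obs <- matrix(c( 12,   8,  31,  41,
                307, 246, 439, 245),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("CHD", "no CHD"),
                              c("0-199", "200-219", "220-259", "260+")))

# Keep only the two lowest cholesterol groups.
sub <- obs[, c("0-199", "200-219")]

# Uncorrected Pearson test on the 2 x 2 sub-table (1 degree of freedom).
fit <- chisq.test(sub, correct = FALSE)
print(fit)
```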

For an I × J table, the usual X^2 or G^2 test for independence has

$$(IJ - 1) - (I - 1) - (J - 1) = (I - 1)(J - 1)$$

degrees of freedom, which is (2 − 1)(4 − 1) = 3 for the full table here.
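The question also asks for M^2 with three sets of scores, which the answer above does not reach. Unlike X^2 and G^2, the ordinal statistic uses a single degree of freedom. The sketch below assumes the usual linear-by-linear form M^2 = (n − 1) r^2, where r is the correlation between row and column scores computed from the table. The helper name m2_test, the three score sets, and in particular the value 280 assigned to the open-ended 260+ group are illustrative assumptions, not choices made in the original answer.

```r
# Observed counts: rows = CHD status, columns = cholesterol group.
obs <- matrix(c( 12,   8,  31,  41,
                307, 246, 439, 245),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("CHD", "no CHD"),
                              c("0-199", "200-219", "220-259", "260+")))

# Hypothetical helper: M^2 = (n - 1) * r^2, referred to chi-square with 1 df.
m2_test <- function(obs, row_scores, col_scores) {
  n <- sum(obs)
  p <- obs / n                                   # joint cell proportions
  u_bar <- sum(rowSums(p) * row_scores)          # mean row score
  v_bar <- sum(colSums(p) * col_scores)          # mean column score
  cov_uv <- sum(p * outer(row_scores - u_bar, col_scores - v_bar))
  var_u  <- sum(rowSums(p) * (row_scores - u_bar)^2)
  var_v  <- sum(colSums(p) * (col_scores - v_bar)^2)
  r  <- cov_uv / sqrt(var_u * var_v)
  M2 <- (n - 1) * r^2
  c(r = r, M2 = M2, p.value = pchisq(M2, df = 1, lower.tail = FALSE))
}

# Three assumed column score sets for the sensitivity analysis.
ct <- colSums(obs)
score_sets <- list(
  equal    = 1:4,                                # equally spaced
  midpoint = c(99.5, 209.5, 239.5, 280),         # interval midpoints; 280 is assumed
  midrank  = (cumsum(ct) - ct) + (ct + 1) / 2    # average rank within each group
)

# Row scores: CHD = 1, no CHD = 0.
results <- t(sapply(score_sets, function(v) m2_test(obs, c(1, 0), v)))
print(signif(results, 4))
```

If the three rows of the output agree on the sign of r and on the order of magnitude of the p-value, the conclusion of an increasing CHD risk with cholesterol is not sensitive to the choice of scores; the midpoint set is the one most affected by the arbitrary value chosen for the 260+ group.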

