Ls4202_endsem_2017 please answer only question no.4. (5 marks) LS4202: Biostatis

Question

Ls4202_endsem_2017




Please answer only question no. 4 (5 marks).


LS4202: Biostatistics End-Semester Test. Marks: 50. Date: May 5, 2017, 2:00 PM-5:00 PM. Instructor: Dr. Robert John Chandran. Answer any 10 questions. Please write brief and precise answers. Marks: 5×10 = 50.

1. Considering dependent (or response) and independent (or predictor) variables that can be categorical or continuous, what are the different statistical models you would use for different combinations of these variables?

2. A researcher has a large number of data pairs (age, height) of humans from birth to 70 years. Show how you can compute the Pearson's Correlation between the two variables. Would you expect it to be positive or negative? Why? What would you suggest to be a major problem with this approach?

3. The regression function is the conditional expectation of Y for any given values of $X_1, X_2, \ldots, X_k$, denoted as $E(Y \mid x_1, x_2, \ldots, x_k) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$. Given this, state the assumptions of the linear regression model regarding the $Y_i$ and $X_i$. What is the rationale for fitting the least squares line, and how would you test whether your assumptions regarding the distribution of $Y_i$ are reasonably met?

4. For n independent observations $Y_1, \ldots, Y_n$, which are plant growth responses to different hormone concentrations, a student derives the least squares regression line $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$. What would be a typical null hypothesis, and what is the basis for computing confidence intervals for $\beta_0$ and $\beta_1$ for hypothesis testing? (GLM)

5. Under what conditions or properties of a response variable Y is a Generalized Linear Model needed over a simple linear regression model? Given the need for a GLM to model your data, what are the basic components of a GLM that you would need?

Explanation / Answer

Terminology

The following briefly explains the terms and abbreviations used in presenting the answer.

Estimated regression of Y on X is given by: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$ (2)

where

$\hat{\beta}_1 = S_{xy}/S_{xx}$ and $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$ (3)

Mean of X: $\bar{X} = \frac{1}{n}\sum x_i$ (4)

Mean of Y: $\bar{Y} = \frac{1}{n}\sum y_i$ (5)

$S_{xx} = \sum (x_i - \bar{X})^2$ (6)

$S_{yy} = \sum (y_i - \bar{Y})^2$ (7)

$S_{xy} = \sum (x_i - \bar{X})(y_i - \bar{Y})$ (8)

All the above sums are over $i = 1, 2, \ldots, n$, where $n$ is the sample size (9)

Estimate of $\sigma^2$ is given by $s^2 = (S_{yy} - \hat{\beta}_1^2 S_{xx})/(n - 2)$ (10)

Standard error of $\hat{\beta}_1$ is $s_b$, where $s_b^2 = s^2/S_{xx}$ (11)

Standard error of $\hat{\beta}_0$ is $s_a$, where $s_a^2 = s_b^2 \left\{\left(\sum_{i=1}^{n} x_i^2\right)/n\right\}$ (12)
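
To make the terminology concrete, here is a minimal Python sketch that computes the quantities in (3) to (12) directly from their definitions. The function name `simple_linear_regression`, the dictionary keys, and the five (concentration, growth) pairs are hypothetical and chosen only for illustration; they do not come from the exam.

```python
from math import sqrt


def simple_linear_regression(x, y):
    """Least-squares estimates and standard errors, following (3)-(12)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need paired samples with n >= 3")

    x_bar = sum(x) / n                                   # (4)
    y_bar = sum(y) / n                                   # (5)
    sxx = sum((xi - x_bar) ** 2 for xi in x)             # (6)
    syy = sum((yi - y_bar) ** 2 for yi in y)             # (7)
    sxy = sum((xi - x_bar) * (yi - y_bar)
              for xi, yi in zip(x, y))                   # (8)

    b1 = sxy / sxx                                       # slope, (3)
    b0 = y_bar - b1 * x_bar                              # intercept, (3)
    s2 = (syy - b1 ** 2 * sxx) / (n - 2)                 # estimate of sigma^2, (10)
    sb2 = s2 / sxx                                       # squared SE of slope, (11)
    sa2 = sb2 * sum(xi ** 2 for xi in x) / n             # squared SE of intercept, (12)

    return {"b0": b0, "b1": b1, "s2": s2,
            "se_b0": sqrt(sa2), "se_b1": sqrt(sb2), "df": n - 2}


# Hypothetical data: hormone concentration vs. plant growth response.
conc = [1.0, 2.0, 3.0, 4.0, 5.0]
growth = [2.1, 3.9, 6.2, 7.8, 10.1]

fit = simple_linear_regression(conc, growth)
print(fit)
```

The sketch uses only the standard library so that each line can be checked against the numbered formula it implements.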

Now, to answer Q4

A typical null hypothesis tests whether the assumed regression model is useful for predicting y from x:

$H_0: \beta_1 = 0$ vs. $H_A: \beta_1 \neq 0$.

Test statistic: $t = \hat{\beta}_1 / SE(\hat{\beta}_1)$ [see (11) above].

Critical value: the upper $100(\alpha/2)\%$ point of the $t$ distribution with $n - 2$ degrees of freedom, written $t_{n-2,\,\alpha/2}$.

The null hypothesis is rejected at the $100\alpha\%$ level of significance if $|t_{cal}| >$ critical value.

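The basis for the confidence intervals is that, under the usual assumption of independent normal errors with constant variance, $(\hat{\beta}_j - \beta_j)/SE(\hat{\beta}_j)$ follows a $t$ distribution with $n - 2$ degrees of freedom for $j = 0, 1$. Hence:

$100(1 - \alpha)\%$ confidence interval (CI) for $\beta_1$: $\hat{\beta}_1 \pm SE(\hat{\beta}_1) \times t_{n-2,\,\alpha/2}$

$100(1 - \alpha)\%$ confidence interval (CI) for $\beta_0$: $\hat{\beta}_0 \pm SE(\hat{\beta}_0) \times t_{n-2,\,\alpha/2}$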
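
As a rough illustration of the test and the interval, the following Python sketch applies the same recipe to a single coefficient. The helper `t_test_and_ci` and the input numbers (a slope estimate of 1.99 with standard error 0.0597 from n = 5 observations) are hypothetical. The critical value 3.182 is the tabulated upper 2.5% point of the t distribution with 3 degrees of freedom, and is passed in by hand rather than computed.

```python
def t_test_and_ci(estimate, std_error, t_crit, null_value=0.0):
    """Two-sided t test of H0: coefficient == null_value, plus the matching CI.

    t_crit is the upper alpha/2 point of t with n - 2 degrees of freedom,
    looked up by the caller (for example, from a t table).
    """
    t_stat = (estimate - null_value) / std_error
    reject = abs(t_stat) > t_crit            # reject H0 if |t| exceeds the critical value
    half_width = t_crit * std_error          # SE times critical value
    return t_stat, reject, (estimate - half_width, estimate + half_width)


# Hypothetical slope estimate and standard error (n = 5, so 3 degrees of freedom).
# 3.182 is the tabulated t critical value for alpha = 0.05, two-sided, 3 df.
t_stat, reject, ci = t_test_and_ci(estimate=1.99, std_error=0.0597, t_crit=3.182)
print(f"t = {t_stat:.2f}, reject H0: {reject}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The same function can be reused for the intercept by passing its estimate and standard error, since both intervals use the same t critical value with n - 2 degrees of freedom.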