1. (70 points) Do not use any kind of software in parts (a)-(g). Show all the de
ID: 2906955 • Letter: 1
Question
1. (70 points) Do not use any kind of software in parts (a)-(g). Show all the details in your calculations. It is known that approximately 55% of high school seniors score 21 or more on the Math portion of the ACT. Suppose you select 15 high school seniors (who took the ACT) from a local high school.
(a) What is the probability that exactly 5 of them scored 21 or more?
(b) What is the probability that none of them scored 21 or more?
(c) What is the probability that at least one scored 21 or more? Hint: use the answer from part (b).
(d) Calculate the expected value (mean) and standard deviation of X, the number of high school seniors who scored 21 or more on the Math portion of the ACT.
(e) Suppose you took another sample of 200 high school seniors, and observed that 90 of them scored 21 or more. Would you consider this sample unusual? Why or why not?
1. State the hypotheses.
2. Determine the critical values (use 5% significance level).
3. Calculate the test statistic.
4. State the decision regarding the null hypothesis.
5. Calculate the p-value, and interpret the result.
6. Is the sample unusual? Why or why not?
(f) Using the information from part (e), construct a 95% confidence interval for the corresponding population proportion, and interpret the result.
(g) Considering the result in part (f), if the margin of error is to be reduced to 0.04 (4%), what would be the required sample size?
(h) Consider the following sample data (counts) for three urban school districts and their seniors’ MATH ACT scores:
                     Score < 21   Score >= 21   Total
School District A        40           33          73
School District B        31           32          63
School District C        39           25          64
Total                   110           90         200
You would like to determine if the score is associated with the location (school districts).
1. State the research question.
2. State the hypotheses.
3. Perform an appropriate hypothesis test using appropriate software.
4. Report all the relevant results (p-value, test statistic, critical value, degrees of freedom, etc.), and interpret them.
5. Finally, answer the research question.
2. (50 points) Consider the data file labeled Level of Socializing. Here is a short description of the data set: 1,400 randomly selected adults 25 to 45 years of age, of four different racial groups (350 in each group), from a large metropolitan area were the participants. The level of socializing was measured by the number of hours an individual spent in social events and/or social support groups.
(a) Select a simple random sample of 20 subjects from each group. Display your data values (they should be stored in four different columns). There should be a total of 80 data values, 20 in each column. Note: it is important that your selected data are as random as possible. Do not just select the first 20 or the last 20 data values. Use random.org (https://www.random.org/integer-sets/) to obtain 20 random numbers between 1 and 350 (see Week 8 announcement). Provide the numbers you obtained; the numbers obtained will identify the appropriate data/person in the sample (note that random numbers themselves are not the data you need to analyze). You will use the same data in parts (b), (c), (d).
(b) Is there any evidence that the mean level of socializing differs between the four groups? To answer this question,
1. State the hypotheses,
2. Perform an appropriate test of hypotheses,
3. Make a decision,
4. Identify and interpret the p-value (use a significance level of 5%),
5. State the result (i.e., answer the research question).
(c) Is there a significant difference between the mean level of socializing of Hispanics and Whites? To answer this question,
1. State the hypotheses,
2. Perform an appropriate test of hypotheses,
3. Make a decision,
4. Identify and interpret the p-value (use a significance level of 5%),
5. State the result (i.e., answer the research question).
(d) Find a 95% confidence interval for the population mean level of socializing for Asians (based on the sample you obtained in part (a)), and interpret the result.
(e) Find the population mean (consider the original data set) level of socializing for Asians. Is this population mean included in the confidence interval in part (d)?
3. (40 points) Our societal values: do taller basketball players get better paid? Consider the data set labeled NBA 2008-2009 Data.
(a) Select 25 basketball players (use random.org as explained in Problem 2), and record their heights and annual salary in two columns. Display your data values. There should be 25 data values in each column.
(b) You would like to see whether there is a correlation between the players’ height and annual salary. Let height be the explanatory (X) variable, and annual salary be the response (Y) variable. Use appropriate software to obtain a full regression output. Provide the complete regression output.
(c) Identify the intercept and slope, and write the regression equation. Identify the coefficient of determination, and interpret the result.
(d) Calculate the coefficient of correlation, and interpret the result.
(e) Find a 95% confidence interval for the population slope. Does the population slope exceed 0? To answer this question, state the hypotheses, identify the p-value, and interpret the result.
(f) Provide a scatter diagram produced by EXCEL.
(g) Identify and interpret the greatest positive residual. Provide the complete list of residuals.
4. (40 points) Consider the High School Teachers’ Salaries data set. In particular, focus on the data (given in thousands of dollars) for the state of Wyoming. Do not use any kind of software in this problem.
(a) Calculate the sample mean and median. Interpret the median.
(b) Calculate the sample standard deviation (rounded to 2 decimal places), and interpret the result.
(c) Locate the first and third quartiles, and interpret the results.
(d) Are there any outliers? Why or why not? Be sure to provide appropriate calculations (see page 36 in the Blue book, if necessary).
(e) Determine a 95% confidence interval for the corresponding population mean, and interpret the result.
(f) If the margin of error should not exceed 3, what should be the sample size? Assume the 95% confidence level.
(g) Is there evidence that the population mean annual income is significantly different from $65,000? (Note: the data in the data set are represented in thousands of dollars). To answer this question,
1. State the hypotheses.
2. Calculate the appropriate test statistic.
3. Estimate the p-value, and state your decision (significance level should be 5%).
4. Assess the strength of your decision using the p-value.
5. Explain what it means to reject or not reject the null hypothesis in this case.
6. How does this compare with your answer in part (d)? Be specific and precise.
Explanation / Answer
Let X be the number of the 15 selected seniors who scored 21 or more. Each senior has probability p = 0.55 of scoring 21 or more, and the seniors are treated as independent, so X is binomial with n = 15 and p = 0.55:
P(X = k) = 15Ck * 0.55^k * 0.45^(15-k)
a)
P(X = 5) = 15C5 * 0.55^5 * 0.45^10 = 3003 * 0.050328 * 0.00034051 = 0.0515
b)
None of them scored 21 or more means X = 0:
P(X = 0) = 0.45^15 = 0.0000063
c)
Required Probability = P(X >= 1) = 1 - P(X = 0) = 1 - 0.0000063 = 0.9999937
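The setup in the question (15 seniors, each with a 55% chance of scoring 21 or more, treated as independent) fits a binomial model, so parts (a) to (c) can be cross-checked numerically. The following is only a minimal Python sketch under that assumption, meant as a check on the hand calculation the question asks for; the helper name binomial_pmf is illustrative. The same model also gives the mean and standard deviation requested in part (d).

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Values taken from the question: 15 seniors, 55% chance of scoring 21+.
n, p = 15, 0.55

p_exactly_5 = binomial_pmf(5, n, p)   # part (a)
p_none = binomial_pmf(0, n, p)        # part (b)
p_at_least_one = 1 - p_none           # part (c), complement of part (b)
mean = n * p                          # part (d), E[X] = n*p
std_dev = sqrt(n * p * (1 - p))       # part (d), SD[X] = sqrt(n*p*(1-p))

print(f"P(X = 5)  = {p_exactly_5:.4f}")
print(f"P(X = 0)  = {p_none:.7f}")
print(f"P(X >= 1) = {p_at_least_one:.7f}")
print(f"mean = {mean:.2f}, standard deviation = {std_dev:.2f}")
```

Under this model the probability of exactly 5 comes out near 0.05, and the probability that none scored 21 or more is a few millionths, so "at least one" is almost certain.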
e)
Sample proportion of seniors who scored 21 or more = 90/200 = 0.45
Null and Alternative Hypotheses:
H0: p = 0.55
Ha: p < 0.55
Alpha = 0.05
Test Statistic: z = (p - p0) / sqrt(p0*(1 - p0)/n) = (0.45 - 0.55) / sqrt(0.55*(1 - 0.55)/200) = -2.84, where p = 0.45 is the sample proportion, p0 = 0.55 and n = 200
Using the z-table,
p-value =P(z<-2.84) = 0.0023
Since the p-value is less than 0.05, we reject the null hypothesis. A sample proportion as low as 0.45 would be unusual if the true proportion were 0.55.
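The one-proportion z test in part (e) can be reproduced with a few lines of Python. This is a sketch under the same choices made in the answer (left-tailed alternative Ha: p < 0.55, normal approximation); the helper normal_cdf is an illustrative name, built on the standard library's math.erf rather than a z-table, so the p-value may differ from the table value in the last digit because z is not rounded to two decimals.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Values taken from part (e) of the question.
n = 200          # sample size
successes = 90   # seniors in the sample who scored 21 or more
p0 = 0.55        # proportion under the null hypothesis
alpha = 0.05     # significance level

p_hat = successes / n
standard_error = sqrt(p0 * (1 - p0) / n)  # uses p0, as in the test statistic
z = (p_hat - p0) / standard_error
p_value = normal_cdf(z)                   # left-tailed: P(Z < z)
reject_null = p_value < alpha

print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_null else "Do not reject H0")
```

For a two-sided alternative (Ha: p ≠ 0.55), the p-value would instead be twice the tail probability, 2 * normal_cdf(-abs(z)).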