
1. Describe a scenario where a researcher could use a Goodness of Fit Test to an

ID: 3359211 • Letter: 1

Question

1. Describe a scenario where a researcher could use a Goodness of Fit Test to answer a research question. Fully describe the scenario and the variables involved and explain the rationale for your answer. Why is that test appropriate to use?

2. Describe a scenario where a researcher could use a Test for Independence to answer a research question. Fully describe the scenario and the variables involved and explain the rationale for your answer. Why is that test appropriate to use?

3. Describe a scenario where a researcher could use an Analysis of Variance to answer a research question. Fully describe the scenario and the variables involved and explain the rationale for your answer. Why is that test appropriate to use?

4. The Goodness of Fit Test and Test for Independence both use the same formula to calculate chi-square. Why? I.e., explain the logic of the test.

5. ANOVA is based on an F-ratio that is calculated as the ratio of two variance estimates, the variance between groups and the variance within groups, but enables conclusions to be made about the means of the samples involved. What is the logic of that? I.e., explain the rationale that supports the use of variance estimates.

Explanation / Answer

We are allowed to answer one question at a time. Please post again for the second question.

1) Here is a full worked example to make the idea easier to follow.

The chi-square goodness of fit test is useful for comparing a theoretical model to observed data. It is one type of the more general chi-square test. As with any topic in mathematics or statistics, it helps to work through an example in order to understand what is happening, so we will walk through one for the chi-square goodness of fit test.

Consider a standard package of milk chocolate M&Ms.

There are six different colors: red, orange, yellow, green, blue and brown. Suppose that we are curious about the distribution of these colors and ask, do all six colors occur in equal proportion? This is the type of question that can be answered with a goodness of fit test.

SETTING

We begin by noting the setting and why the goodness of fit test is appropriate. Our variable of color is categorical. There are six levels of this variable, corresponding to the six colors that are possible. We will assume that the M&Ms we count will be a simple random sample from the population of all M&Ms.

NULL AND ALTERNATIVE HYPOTHESES

The null and alternative hypotheses for our goodness of fit test reflect the assumption that we are making about the population. Since we are testing whether the colors occur in equal proportions, our null hypothesis will be that all colors occur in the same proportion. More formally, if p1 is the population proportion of red candies, p2 is the population proportion of orange candies, and so on, then the null hypothesis is that p1 = p2 = . . . = p6 = 1/6.

The alternative hypothesis is that at least one of the population proportions is not equal to 1/6.

ACTUAL AND EXPECTED COUNTS

The actual counts are the number of candies for each of the six colors. The expected count refers to what we would expect if the null hypothesis were true. We will let n be the size of our sample.

The expected number of red candies is p1 x n, or n/6. In fact, for this example, the expected number of candies for each of the six colors is simply n times p_i (the hypothesized proportion for color i), or n/6.
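
As a small illustration of the expected-count rule, the following Python sketch computes n times p_i for each color under the null hypothesis. The sample size of 600 is the one used in the worked example further down; the variable names are made up for illustration.

```python
from fractions import Fraction

# Six colors named in the example; under H0 each has proportion 1/6.
colors = ["red", "orange", "yellow", "green", "blue", "brown"]
null_proportions = {color: Fraction(1, 6) for color in colors}

n = 600  # sample size used in the worked example

# Expected count for each color = n * p_i
expected_counts = {color: float(n * p) for color, p in null_proportions.items()}

for color, count in expected_counts.items():
    print(f"{color}: {count}")
```

If the null hypothesis specified unequal proportions instead, only the values in null_proportions would change; the n times p_i rule stays the same.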

CHI-SQUARE STATISTIC FOR GOODNESS OF FIT

We will now calculate a chi-square statistic for a specific example. Suppose that we have a simple random sample of 600 M&M candies with the following distribution:

If the null hypothesis were true, then the expected counts for each of these colors would be (1/6) x 600 = 100. We now use this in our calculation of the chi-square statistic.

We calculate the contribution to our statistic from each of the colors. Each is of the form (Actual - Expected)^2 / Expected:

We then total all of these contributions and determine that our chi-square statistic is 125.44 + 22.09 + 0.09 + 25 + 29.16 + 33.64 = 235.42.
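
The arithmetic can be checked with a short Python sketch. The table of observed counts is not reproduced in this answer, so the counts below are an assumption: they are one set of counts that totals 600 and reproduces the six contributions listed (each contribution times 100 is a squared deviation from the expected count of 100). No color labels are attached, because the assignment of counts to colors cannot be recovered from the text.

```python
# Observed counts: an ASSUMPTION reconstructed from the contributions in the text
# (they sum to 600 and reproduce 125.44, 22.09, 0.09, 25, 29.16, 33.64).
observed = [212, 147, 103, 50, 46, 42]

n = sum(observed)
expected = n / len(observed)  # 100 per color under the null hypothesis

# Each contribution is (Actual - Expected)^2 / Expected
contributions = [(o - expected) ** 2 / expected for o in observed]
chi_square = sum(contributions)

for o, c in zip(observed, contributions):
    print(f"observed={o:3d}  contribution={c:.2f}")
print(f"chi-square statistic = {chi_square:.2f}")
```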

DEGREES OF FREEDOM

The number of degrees of freedom for a goodness of fit test is simply one less than the number of levels of our variable. Since there were six colors, we have 6 – 1 = 5 degrees of freedom.

CHI-SQUARE TABLE AND P-VALUE

The chi-square statistic of 235.42 that we calculated corresponds to a particular location on a chi-square distribution with five degrees of freedom. We now need a p-value, which is the probability of obtaining a test statistic at least as extreme as 235.42, assuming that the null hypothesis is true.

Microsoft’s Excel can be used for this calculation. We find that our test statistic with five degrees of freedom has a p-value of about 7.29 x 10^-49. This is an extremely small p-value.
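
As an alternative to a spreadsheet, the upper-tail probability can be sketched in plain Python. The helper name chi2_sf is made up for this illustration; it relies on the standard finite-sum forms of the chi-square tail for whole-number degrees of freedom and uses only the standard math module. For the statistic in this example the result should come out on the order of 10^-49, in line with the value quoted above.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df.

    Uses finite-sum closed forms, which avoid computing 1 - CDF and so
    remain usable for very small tail probabilities.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        return math.exp(-half) * sum(
            half ** j / math.factorial(j) for j in range(df // 2)
        )
    # Odd df: erfc(sqrt(x/2)) + sum_{j=0}^{(df-3)/2} (x/2)^(j+1/2) exp(-x/2) / Gamma(j+3/2)
    tail = math.erfc(math.sqrt(half))
    for j in range((df - 1) // 2):
        tail += half ** (j + 0.5) * math.exp(-half) / math.gamma(j + 1.5)
    return tail

chi_square = 235.42  # statistic from the worked example
df = 5               # six colors minus one
p_value = chi2_sf(chi_square, df)
print(f"p-value = {p_value:.3e}")
```

A statistics package would normally be used for this step; the hand-written helper is only meant to show what such a function computes.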

DECISION RULE

We make our decision on whether to reject the null hypothesis based upon the size of the p-value.

Since we have a minuscule p-value, we reject the null hypothesis. We conclude that M&Ms are not evenly distributed among the six different colors. A follow-up analysis could be used to determine a confidence interval for the population proportion of one particular color.
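
The follow-up analysis is not specified in this answer. One common choice, shown here purely as a sketch, is a normal-approximation (Wald) confidence interval for a single proportion. The count of 212 is an assumption: it is the largest count consistent with the contributions in the example, and which color it belongs to is not stated. The 95% confidence level is also an illustrative choice.

```python
import math
from statistics import NormalDist

count = 212        # ASSUMPTION: largest count consistent with the listed contributions
n = 600            # sample size from the worked example
confidence = 0.95  # illustrative confidence level

p_hat = count / n
# Two-sided critical value from the standard normal distribution
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - margin, p_hat + margin

print(f"{confidence:.0%} CI for the proportion: ({low:.3f}, {high:.3f})")
print("Interval excludes 1/6:", not (low <= 1 / 6 <= high))
```

An interval that lies entirely above or below 1/6 shows in which direction that one color departs from the equal-proportion model.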