


Question

10. (15 points) Refer to the score.csv file, which shows evaluation scores of 74 regional stores. The score is between 0 and 1. (Must use R Commander)

A. Analyze the relationship between two variables, store type and region.

B. Compare the distributions of Score1 and Score2.

C. Is there a difference in Score 1 between the four regions?

ID Region Type Score1  Score2
1  KS     A    1       0.34919
2  KS     A    0.85521 0.94532
3  KS     A    0.79998 0.70441
4  KS     A    0.82704 0.92754
5  KS     B    0.89742 0.92354
6  KS     B    0.90846 0.99992
7  KS     C    0.53386 0.92049
8  KS     C    0.97474 0.95297
9  KS     C    1       0.67937
10 KS     C    0.71218 0.81025
11 KS     A    0.88564 0.87948
12 KS     A    0.98708 0.81367
13 KS     B    0.82295 0.76798
14 KS     B    0.96321 0.79854
15 KS     B    1       1
16 KS     C    0.89682 0.93192
17 KS     C    0.92643 0.76036
18 KS     C    0.96061 0.73621
19 KS     C    0.95932 0.81113
20 KS     A    0.87596 0.88145
21 KS     C    1       1
22 KS     C    0.51745 0.80966
23 CC     C    0.49959 0.3807
24 KS     C    0.86401 0.79403
25 KS     C    0.94878 0.47601
26 KS     C    0.92734 0.72154
27 KS     C    1       0.97787
28 KS     C    0.88444 0.79155
29 KS     C    0.59194 1
30 KS     C    1       0.88959
31 CC     C    1       1
32 CC     B    1       0.88162
33 KW     C    0.84675 0.68196
34 KW     C    1       0.84788
35 KW     A    0.97293 0.46964
36 KW     A    0.97967 0.96908
37 KW     A    0.9764  0.62001
38 KW     B    0.64646 0.88786
39 KW     B    0.8989  0.94951
40 KW     B    1       1
41 CC     C    0.93753 0.85607
42 CC     C    0.92651 0.95031
43 CC     C    0.75591 0.82727
44 CC     C    0.90052 0.81543
45 CC     C    1       0.96817
46 CC     C    0.91055 0.97896
47 CC     C    0.81538 0.54282
48 CC     B    0.87846 0.77381
49 CC     C    0.99935 0.96025
50 CC     C    0.60332 0.98012
51 CC     C    0.18641 0.72207
52 CL     C    1       1
53 CL     A    0.57541 0.88852
54 CL     A    0.9769  0.99352
55 CL     A    0.64496 0.9926
56 CL     B    1       0.9734
57 CL     B    0.67154 0.45983
58 CL     B    0.99596 0.82571
59 CL     B    1       0.5879
60 CL     C    0.88008 0.43353
61 CL     C    1       1
62 CL     A    0.95958 0.82149
63 CL     A    0.97523 0.88502
64 CL     A    1       0.48574
65 CL     A    0.93975 0.55833
66 CL     A    1       0.32493
67 CL     B    1       0.9607
68 CL     B    0.99035 0.90099
69 CL     C    0.96456 1
70 KS     C    1       1
71 KS     A    0.83724 0.42934
72 KS     A    0.97511 0.53691
73 KS     A    0.90436 0.93548
74 CC     C    0.98556 0.95368

Explanation / Answer

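First, select all of the data in the table above (including the header row) and copy it to the clipboard,

then run the following R commands in the R console,

p=read.table("clipboard",header=T);p
attach(p) #make Region, Type, Score1 and Score2 available by name

to get all the data into R as the data frame p. Then proceed as follows,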
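Reading from the clipboard depends on the platform and on the copy step, so a reproducible variant may be easier to check. The sketch below assumes the table is available as whitespace-separated text; it embeds only the first four rows for illustration, and the name `txt` is hypothetical.

```r
# First four rows of the table, embedded as text for illustration only
txt <- "ID Region Type Score1 Score2
1 KS A 1 0.34919
2 KS A 0.85521 0.94532
3 KS A 0.79998 0.70441
4 KS A 0.82704 0.92754"

# read.table() splits on whitespace by default; header = TRUE takes
# the column names from the first row
p <- read.table(text = txt, header = TRUE)

str(p)     # check column names and types
p$Score1   # columns can also be used as p$Name, without attach()
```

If score.csv is an ordinary comma-separated file, `read.csv("score.csv")` would presumably load it in the same way; the exact layout of that file is not shown in the question, so this is an assumption.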

1) To analyze the relationship between the two variables, store type and region, we use the chi-square test of independence in R as,

c=table(Region,Type);c #contingency table
chisq.test(c) #chi-square test of independence

The output is,

> c=table(Region,Type);c #contingency table
      Type
Region  A  B  C
    CC  0  2 13
    CL  8  6  4
    KS 10  5 18
    KW  3  3  2
> chisq.test(c) #chi-square test of independence

Pearson's Chi-squared test

data: c
X-squared = 17.309, df = 6, p-value = 0.008211

Here the p-value (0.008211) is less than 0.05, so we reject the null hypothesis that Region is independent of store type.

Thus there is evidence of an association between Region and store type.
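To see where the statistic comes from, the test can be recomputed by hand from the contingency table printed in the output. This is a minimal sketch: the counts are copied from that table, and the names `tab`, `expected`, `x2`, `dof` and `pval` are introduced here for illustration only.

```r
# Counts copied from the contingency table in the output
tab <- matrix(c( 0, 2, 13,
                 8, 6,  4,
                10, 5, 18,
                 3, 3,  2),
              nrow = 4, byrow = TRUE,
              dimnames = list(Region = c("CC", "CL", "KS", "KW"),
                              Type   = c("A", "B", "C")))

# Expected count under independence: row total * column total / grand total
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Pearson statistic, degrees of freedom and upper-tail p-value
x2   <- sum((tab - expected)^2 / expected)
dof  <- (nrow(tab) - 1) * (ncol(tab) - 1)
pval <- pchisq(x2, dof, lower.tail = FALSE)

round(c(X.squared = x2, df = dof, p.value = pval), 4)

# Number of cells whose expected count is below 5 (a common rule of thumb)
sum(expected < 5)
```

Several expected counts fall below 5, so the chi-square approximation may be rough for this table. `chisq.test(tab, simulate.p.value = TRUE)` or `fisher.test(tab)` are possible cross-checks.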

2) To compare the distributions of Score1 and Score2, we use the two-sample Kolmogorov-Smirnov test.

The R command is,

ks.test(Score1,Score2)

The output is,

> ks.test(Score1,Score2)

Two-sample Kolmogorov-Smirnov test

data: Score1 and Score2
D = 0.24324, p-value = 0.02509
alternative hypothesis: two-sided

Since the p-value (0.02509) is less than 0.05, we reject the null hypothesis that the two variables have the same distribution. Thus we conclude that the distributions of Score1 and Score2 are different.
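The statistic D reported by the test is the largest vertical gap between the two empirical distribution functions. The sketch below illustrates this on two small hypothetical samples (not the store data), chosen without ties so that `ks.test` runs without warnings.

```r
# Hypothetical samples for illustration only; no tied values
x <- c(0.52, 0.61, 0.70, 0.83, 0.91, 0.97)
y <- c(0.35, 0.48, 0.66, 0.74, 0.88)

# Empirical distribution function of each sample
Fx <- ecdf(x)
Fy <- ecdf(y)

# Both step functions only change at observed values, so the largest
# gap is attained at one of the pooled data points
pooled <- sort(c(x, y))
D <- max(abs(Fx(pooled) - Fy(pooled)))

D
ks.test(x, y)$statistic   # should agree with D
```

In the actual data many scores are exactly 1, so `ks.test` will likely warn that the p-value is approximate because of ties. Score1 and Score2 are also measured on the same 74 stores, while the two-sample test treats them as independent samples, so the p-value is probably best read as a rough guide.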

3) To check whether there is a difference in Score1 between the four regions, we use one-way ANOVA with the R commands,

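x1=which(Region=="KS") #row indices for each region
x2=which(Region=="CC")
x3=which(Region=="CL")
x4=which(Region=="KW")
a1=Score1[x1] #Score1 values for each region
a2=Score1[x2]
a3=Score1[x3]
a4=Score1[x4]
y=c(a1,a2,a3,a4) #responses stacked region by region
#group labels in the same order: a1=KS, a2=CC, a3=CL, a4=KW
treat=rep(c("a1","a2","a3","a4"),c(length(a1),length(a2),length(a3),length(a4)))
d2=data.frame(treat,y)
model=aov(y~treat,data=d2) #one-way ANOVA of Score1 on region
model
summary(model)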

The output is,

Call:
aov(formula = y ~ treat, data = d2)

Terms:
                    treat Residuals
Sum of Squares  0.0814866 1.7183621
Deg. of Freedom         3        70

Residual standard error: 0.1566781
Estimated effects may be unbalanced
> summary(model)
            Df Sum Sq Mean Sq F value Pr(>F)
treat        3 0.0815 0.02716   1.106  0.352
Residuals   70 1.7184 0.02455
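As a check on the ANOVA table, the F value and p-value can be recomputed from the printed sums of squares and degrees of freedom. The variable names in this sketch are chosen here for illustration; the numbers are copied from the output.

```r
# Values copied from the ANOVA output
ss_treat <- 0.0814866; df_treat <- 3
ss_resid <- 1.7183621; df_resid <- 70

ms_treat <- ss_treat / df_treat   # mean square between regions
ms_resid <- ss_resid / df_resid   # mean square within regions
f_value  <- ms_treat / ms_resid   # F statistic
p_value  <- pf(f_value, df_treat, df_resid, lower.tail = FALSE)

# sqrt(ms_resid) corresponds to the residual standard error in the output
round(c(F = f_value, p = p_value, resid_se = sqrt(ms_resid)), 4)
```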

Here the p-value (0.352) is greater than 0.05, so we fail to reject the null hypothesis that all treatment (regional) effects are the same.

Thus there is no evidence of a difference in Score1 between the four regions.
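The manual grouping used for this part can also be written with the formula interface of `aov`, which splits the response by a grouping column itself. The sketch below is an alternative, not the approach used in the answer; it runs on an illustrative subset of three stores per region taken from the table, with hypothetical names `d`, `fit` and `tbl`.

```r
# Illustrative subset: three stores per region, taken from the data table
d <- data.frame(
  Region = rep(c("KS", "CC", "CL", "KW"), each = 3),
  Score1 = c(1, 0.85521, 0.79998,     # KS: IDs 1-3
             0.49959, 1, 1,           # CC: IDs 23, 31, 32
             1, 0.57541, 0.9769,      # CL: IDs 52-54
             0.84675, 1, 0.97293)     # KW: IDs 33-35
)

# Formula interface: aov() groups Score1 by Region itself
fit <- aov(Score1 ~ Region, data = d)
summary(fit)

# Pull the F statistic and p-value out of the summary table
tbl <- summary(fit)[[1]]
c(F = tbl[["F value"]][1], p = tbl[["Pr(>F)"]][1])
```

With the full data frame loaded, the analogous call would be `aov(Score1 ~ Region, data = p)`, assuming `p` holds all 74 rows.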

Dr Jack