*ANSWER ONLY IF YOU KNOW RSTUDIO* Now You Try! Use the code in the file Duplex.R
Question
*ANSWER ONLY IF YOU KNOW RSTUDIO*
Now You Try!
Use the code in the file Duplex.R to conduct a hypothesis test to determine if the sale prices of duplexes are normally distributed.
Load the code into RStudio and run through it. You will not need to modify the code to answer the questions below.
Answer the following questions:
a. How many observations are in the data set Duplex?
b. How many observations do we EXPECT in the region between the mean and 0.75 standard deviations above the mean for the sale prices in the Duplex data set? Round your answer to one decimal place.
c. How many observations do we OBSERVE in the region between the mean and 0.75 standard deviations above the mean for the sale prices in the Duplex data set?
d. What is the test statistic? Round your answer to two decimal places.
e. What is the p-value? Round your answer to four decimal places.
Explanation / Answer
The modified code is
#Highlight them and press Run or CTRL+ENTER
require(mosaic)
require(openintro)
require(MASS)
#read in the Ames Housing data set
AmesHousing<-read.csv("http://www.math.usu.edu/cfairbourn/Stat2300/RStudioFiles/data/AmesHousing.csv")
#Data set contains information from the Ames Assessor's Office
#used in computing assessed values for individual residential
#properties sold in Ames, IA from 2006 to 2010.
Duplex<-subset(AmesHousing, AmesHousing$Bldg.Type=="Duplex")
## number of observations in the Duplex subset
nrow(Duplex)
#Calculate the mean and sd of the variable
s <- sd(Duplex$SalePrice)
m <- mean(Duplex$SalePrice)
## number of observations between m and m + 0.75*s
o <- m + 0.75*s
duplex.observe <- subset(Duplex, Duplex$SalePrice >= m & Duplex$SalePrice <= o)
nrow(duplex.observe)
## test for normality: Shapiro-Wilk test
shapiro.test(Duplex$SalePrice)
The results are
> nrow(Duplex)
[1] 109
> nrow(duplex.observe)
[1] 34
> shapiro.test(Duplex$SalePrice)
Shapiro-Wilk normality test
data: Duplex$SalePrice
W = 0.9319, p-value = 3.042e-05
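The test statistic and the p-value are given in the Shapiro-Wilk output above: W = 0.9319 and p-value = 3.042e-05, which is 0.0000 to four decimal places. Because the p-value is far below 0.05, we reject the hypothesis that the sale prices are normally distributed. Note that the Shapiro-Wilk test was substituted here. Since the question asks for expected and observed counts, Duplex.R may instead use a chi-square goodness-of-fit test, whose statistic and p-value would differ from these.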
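Parts (b) and (c) of the question ask for expected and observed counts, which is how a chi-square goodness-of-fit test for normality is usually set up. The sketch below shows one way such a test could look. It is not the code from Duplex.R: the simulated data, the bin edges in `z_cuts`, and the degrees-of-freedom convention are all assumptions made for illustration.

```r
# Hypothetical sketch of a chi-square goodness-of-fit test for normality.
# The data are simulated and the bin edges are assumed; neither comes from Duplex.R.
set.seed(1)
x <- rnorm(109, mean = 140000, sd = 30000)  # stand-in for Duplex$SalePrice

m <- mean(x)
s <- sd(x)

# Bin edges in standard-deviation units (assumed for illustration)
z_cuts <- c(-Inf, -0.75, 0, 0.75, Inf)

# Observed counts in each bin
observed <- as.vector(table(cut((x - m) / s, breaks = z_cuts)))

# Expected counts in each bin if the data were normal
probs <- diff(pnorm(z_cuts))
expected <- length(x) * probs

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_sq <- sum((observed - expected)^2 / expected)

# A common convention subtracts one degree of freedom for each estimated
# parameter (mean and sd); the course code may use a different convention.
df <- length(probs) - 1 - 2
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

chi_sq
p_value
```

With the real data, `x` would be replaced by `Duplex$SalePrice` and the bins by whatever regions Duplex.R defines.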
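For part (b), the expected count comes from the normal distribution. The probability of falling between the mean and 0.75 standard deviations above it is
P(0 < Z < 0.75) = pnorm(0.75) - 0.5 = 0.7734 - 0.5 = 0.2734
Taking 75% of the 34% that lies within 1 sd above the mean does not work, because the normal density is not flat over that range.
There are 109 records in Duplex in total, so the expected count is
109*0.2734 = 29.8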
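As a check on the expected count, the normal CDF gives the proportion between the mean and 0.75 standard deviations above it directly. This is a minimal sketch that assumes n = 109 from the `nrow(Duplex)` output above; the variable names are illustrative.

```r
# Expected count between the mean and 0.75 sd above it under normality.
n <- 109                          # number of duplex records, from nrow(Duplex)
p <- pnorm(0.75) - pnorm(0)       # P(0 < Z < 0.75), about 0.2734
expected <- n * p
round(expected, 1)                # 29.8
```

This value is compared with the observed count of 34 reported by `nrow(duplex.observe)`.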