Bootstrapping is sampling with replacement from a sample that is representative
Question
Bootstrapping is sampling with replacement from a sample that is representative of the population whose parameters we wish to estimate. You will obtain many random samples with replacement from the sample data and compute the mean of each random sample. For an 80% confidence interval, you will find the cutoff points for the middle 80% of the sample means; that is, the 10th and 90th percentiles. These cutoff points are the lower and upper bounds of the confidence interval.

The following data represent the miles per gallon that 10 individuals experienced with their 2009 Smart car with a 1.0 Liter engine and c transmission:

34.0 34.6 39.8 36.6 42.9 41.5 46.3 35.3 32.3 43.8

Source: fueleconomy.gov

1) Create 1000 means from "bootstrap samples" of size 10. Follow the path Applets > Resampling > Bootstrap a Statistic. Find the 10th and 90th percentiles.
2) Find the 80% T-interval for the population mean mpg for all 2009 Smart cars, using the data above. Use StatCrunch and check with the TI calculator. What assumption has to be checked before you can use the formulas and technology?
3) Compare the two intervals from steps 1 and 2. Are they similar?
4) In what cases is "bootstrapping" most useful? Do we need to use it in this case?

Explanation / Answer
Descriptive statistics for miles/gallon, with bootstrap results based on 1000 bootstrap samples and 80% confidence intervals.

Descriptive Statistics (Miles/gallon)

                                            Bootstrap(a)  Bootstrap(a)  80% CI     80% CI
Measure             Statistic  Std. Error   Bias          Std. Error    Lower      Upper
N                   10                      0             0             10         10
Range               14.00
Minimum             32.30
Maximum             46.30
Mean                38.7100    1.51397      -.0549        1.4253        36.8110    40.5200
Std. Deviation      4.78759                 -.29229       .72332        3.56021    5.43684
Variance            22.921                  -2.191        6.351         12.675     29.559
Skewness            .227       .687         .001          .645          -.561      1.043
Kurtosis            -1.439     1.334        .470          1.191         -1.947     .354
Valid N (listwise)  10                      0             0             10         10

a. Unless otherwise noted, bootstrap results are based on 1000 bootstrap samples.
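The R code below generates 1000 bootstrap means and computes their 10th and 90th percentiles.
###############################################################
y <- c(34.0, 34.6, 39.8, 36.6, 42.9, 41.5, 46.3, 35.3, 32.3, 43.8)  # miles/gallon data
library(boot)  # load the boot package (run install.packages("boot") first if it is missing)
boot_mean <- function(x, i) mean(x[i])  # statistic: mean of the resampled observations x[i]
b1 <- boot(y, boot_mean, R = 1000)      # draw 1000 bootstrap samples
b1                                      # original mean, bias, and bootstrap standard error
b1$t                                    # the 1000 bootstrap means
quantile(b1$t, 0.1)                     # 10th percentile: lower bound of the 80% interval
quantile(b1$t, 0.9)                     # 90th percentile: upper bound of the 80% interval
plot(b1, col = 2)                       # histogram and normal Q-Q plot of the bootstrap means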
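For part 2 of the question, the 80% t-interval can be computed from the same data for comparison with the bootstrap percentiles. The following is a minimal sketch in base R, not the StatCrunch or TI procedure the question asks for; the variable names (xbar, se, tcrit, t_interval) are illustrative. The t-interval assumes the sample comes from an approximately normal population with no outliers, which matters here because n = 10 is small.

```r
# Miles-per-gallon data, as entered in the answer's code
y <- c(34.0, 34.6, 39.8, 36.6, 42.9, 41.5, 46.3, 35.3, 32.3, 43.8)

n     <- length(y)
xbar  <- mean(y)                 # sample mean
se    <- sd(y) / sqrt(n)         # standard error of the mean
tcrit <- qt(0.90, df = n - 1)    # critical value leaving 10% in each tail

# 80% t-interval: mean plus or minus critical value times standard error
t_interval <- c(lower = xbar - tcrit * se, upper = xbar + tcrit * se)
print(t_interval)

# Cross-check against R's built-in one-sample t procedure
builtin_interval <- t.test(y, conf.level = 0.80)$conf.int
print(builtin_interval)
```

If the two intervals from parts 1 and 2 are close, that suggests the normal-theory interval is adequate for these data and the bootstrap is not strictly needed.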
The basic idea of bootstrapping is that inference about a population from sample data (sample → population) can be modelled by resampling the sample data and performing inference about the sample from the resampled data (resampled → sample). Because the population is unknown, the true error of a sample statistic relative to its population value is also unknown. In bootstrap resamples, the "population" is in fact the sample, and this is known; hence the quality of inference about the "true" sample from resampled data (resampled → sample) is measurable.
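The resampling idea can also be written out by hand, without the boot package. The sketch below is illustrative only: the seed and the names B, boot_means, bias_hat, se_hat, and ci_80 are assumptions, not part of the original answer. Because the sample plays the role of the population, its mean is known exactly, so the bias and standard error of the resampled means can be measured directly.

```r
set.seed(1)  # arbitrary seed, used only so the sketch is repeatable

# Miles-per-gallon data, as entered in the answer's code
y <- c(34.0, 34.6, 39.8, 36.6, 42.9, 41.5, 46.3, 35.3, 32.3, 43.8)

B <- 1000  # number of bootstrap samples

# Each replicate draws 10 values from y with replacement and takes their mean
boot_means <- replicate(B, mean(sample(y, size = length(y), replace = TRUE)))

# The "population" mean here is mean(y), so the error of the resampled means is measurable
bias_hat <- mean(boot_means) - mean(y)  # bootstrap estimate of bias
se_hat   <- sd(boot_means)              # bootstrap standard error of the mean

# Percentile interval: middle 80% of the bootstrap means
ci_80 <- quantile(boot_means, probs = c(0.10, 0.90))

print(c(bias = bias_hat, se = se_hat))
print(ci_80)
```

The exact numbers vary with the seed and the number of resamples, so they will not match the table above digit for digit.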