Question
It makes sense to believe that putting lots of people together would increase the murder rate. First, let's look at the correlation between the number of people living in the area and the number of murders per year. While a few people might actually look at murder rates when looking for a place to live, I think it makes more sense to use 'Population' as our X variable and 'Murders per million inhabitants per year' as our Y variable.
A. What percentage of the differences in murder rates can we explain using population?
B. Was this correlation significant in the population?
C. What percentage of the differences in murder rates can we explain using % of population with incomes below $5,000?
City  Population  % incomes below $5,000  % unemployed  Murders per million inhabitants per year
1     587000      16.5                    6.2           11.2
2     643000      20.5                    6.4           13.4
3     635000      26.3                    9.3           40.7
4     692000      16.5                    5.3           5.3
5     1248000     19.2                    7.3           24.8
6     643000      16.5                    5.9           12.7
7     1964000     20.2                    6.4           20.9
8     1531000     21.3                    7.6           35.7
9     713000      17.2                    4.9           8.7
10    749000      14.3                    6.4           9.6
11    7895000     18.1                    6             14.5
12    762000      23.1                    7.4           26.9
13    2793000     19.1                    5.8           15.7
14    741000      24.7                    8.6           36.2
15    625000      18.6                    6.5           18.1
16    854000      24.9                    8.3           28.9
17    716000      17.9                    6.7           14.9
18    921000      22.4                    8.6           25.8
19    595000      20.2                    8.4           21.7
20    3353000     16.9                    6.7           25.7

Explanation / Answer
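A) I am using R to solve this problem.
First, we can load the data into R using the read.table() function as below. This assumes Data.txt is space-separated and has single-word column headers (for example Population and MurdersPerMillionInhabitantsPerYear), since those are the names used in the commands that follow:
Data <- read.table("Data.txt", header = TRUE, sep = " ")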
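Since Data.txt itself is not included with the question, the same data frame can also be typed in directly from the table. The following is only a sketch: the column names match the ones used in this answer, except PercentageUnemployed, which is an assumed name because that column is never referenced here.

```r
# Inline copy of the question's table, so the analysis can be reproduced
# without Data.txt. PercentageUnemployed is an assumed (hypothetical) name.
Data <- data.frame(
  City = 1:20,
  Population = c(587000, 643000, 635000, 692000, 1248000, 643000, 1964000,
                 1531000, 713000, 749000, 7895000, 762000, 2793000, 741000,
                 625000, 854000, 716000, 921000, 595000, 3353000),
  PercentageIncomesBelow5K = c(16.5, 20.5, 26.3, 16.5, 19.2, 16.5, 20.2,
                               21.3, 17.2, 14.3, 18.1, 23.1, 19.1, 24.7,
                               18.6, 24.9, 17.9, 22.4, 20.2, 16.9),
  PercentageUnemployed = c(6.2, 6.4, 9.3, 5.3, 7.3, 5.9, 6.4, 7.6, 4.9, 6.4,
                           6.0, 7.4, 5.8, 8.6, 6.5, 8.3, 6.7, 8.6, 8.4, 6.7),
  MurdersPerMillionInhabitantsPerYear = c(11.2, 13.4, 40.7, 5.3, 24.8, 12.7,
                                          20.9, 35.7, 8.7, 9.6, 14.5, 26.9,
                                          15.7, 36.2, 18.1, 28.9, 14.9, 25.8,
                                          21.7, 25.7)
)

# Quick sanity check: 20 cities, 5 columns
str(Data)
```

With Data built this way, the cor() and lm() calls in this answer can be run unchanged.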
Correlation between two variables can be calculated using the cor() function in R as below:
cor(Data$Population,Data$MurdersPerMillionInhabitantsPerYear)
-0.0670984
We can fit a linear model in R using the lm() function. Here Population is our X variable and 'Murders per million inhabitants per year' is the Y variable:
fit <- lm(MurdersPerMillionInhabitantsPerYear ~ Population, data = Data)
We can find the coefficient value and the summary of the model using the summary() function in R.
summary(fit)
Call:
lm(formula = MurdersPerMillionInhabitantsPerYear ~ Population,
    data = Data)

Residuals:
    Min      1Q  Median      3Q     Max
-15.558  -7.652  -1.124   5.925  19.819

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.113e+01  2.992e+00   7.062 1.38e-06 ***
Population  -3.892e-07  1.364e-06  -0.285    0.779
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10.13 on 18 degrees of freedom
Multiple R-squared:  0.004502,  Adjusted R-squared:  -0.0508
F-statistic: 0.08141 on 1 and 18 DF,  p-value: 0.7787
We can see that the R-squared value is 0.004502, so only about 0.45% of the differences in murder rates are explained by population.
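In a regression with a single predictor, R-squared is just the square of the correlation coefficient, so the two numbers reported above can be cross-checked against each other. A minimal sketch, using plain vectors typed in from the question's table (the names population and murders are chosen here for illustration and are not part of the original answer):

```r
# Population and murder rate for the 20 cities, copied from the table
population <- c(587000, 643000, 635000, 692000, 1248000, 643000, 1964000,
                1531000, 713000, 749000, 7895000, 762000, 2793000, 741000,
                625000, 854000, 716000, 921000, 595000, 3353000)
murders <- c(11.2, 13.4, 40.7, 5.3, 24.8, 12.7, 20.9, 35.7, 8.7, 9.6,
             14.5, 26.9, 15.7, 36.2, 18.1, 28.9, 14.9, 25.8, 21.7, 25.7)

r <- cor(population, murders)          # Pearson correlation
fit <- lm(murders ~ population)        # simple linear regression
r_squared <- summary(fit)$r.squared    # R-squared as reported by summary()

# For one predictor, r^2 and R-squared agree up to rounding error
isTRUE(all.equal(r^2, r_squared))

# Percentage of the variation in murder rates explained by population
round(100 * r_squared, 2)
```

Pulling the value out with summary(fit)$r.squared avoids copying it by hand from the printed output.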
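B) The p-value for the Population slope is 0.779, far above the usual 0.05 threshold. In a regression with one predictor, this is the same test as the test of the correlation itself, so the correlation is not statistically significant.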
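The significance of the correlation can also be tested directly with R's cor.test() function, which should give the same p-value as the slope test in the regression. This is a sketch under the assumption that a two-sided Pearson test at the 0.05 level is what the question intends; the vector names are illustrative.

```r
# Population and murder rate for the 20 cities, copied from the table
population <- c(587000, 643000, 635000, 692000, 1248000, 643000, 1964000,
                1531000, 713000, 749000, 7895000, 762000, 2793000, 741000,
                625000, 854000, 716000, 921000, 595000, 3353000)
murders <- c(11.2, 13.4, 40.7, 5.3, 24.8, 12.7, 20.9, 35.7, 8.7, 9.6,
             14.5, 26.9, 15.7, 36.2, 18.1, 28.9, 14.9, 25.8, 21.7, 25.7)

# Two-sided test of H0: the population correlation is zero
# (Pearson is the default method)
test <- cor.test(population, murders)

test$estimate    # sample correlation
test$statistic   # t statistic on n - 2 = 18 degrees of freedom
test$p.value     # p-value of the test

# TRUE would mean "significant at the 5% level" (assumed threshold)
test$p.value < 0.05
```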
C) For this we can fit a linear model with '% of population with incomes below $5,000' as X variable as below:
fit1 <- lm(MurdersPerMillionInhabitantsPerYear ~ PercentageIncomesBelow5K, data = Data)
summary(fit1)
Call:
lm(formula = MurdersPerMillionInhabitantsPerYear ~ PercentageIncomesBelow5K,
    data = Data)

Residuals:
    Min      1Q  Median      3Q     Max
-9.1663 -2.5613 -0.9552  2.8887 12.3475

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)               -29.901      7.789  -3.839   0.0012 **
PercentageIncomesBelow5K    2.559      0.390   6.562 3.64e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.512 on 18 degrees of freedom
Multiple R-squared:  0.7052,  Adjusted R-squared:  0.6889
F-statistic: 43.06 on 1 and 18 DF,  p-value: 3.638e-06
We can see that the R-squared value is 0.7052, so 70.52% of the differences in murder rates can be explained using % of population with incomes below $5,000.
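For anyone who wants to reproduce part C without the data file, the sketch below refits the income model from vectors typed in from the question's table and pulls out the R-squared and slope programmatically. The vector names low_income and murders are illustrative and do not appear in the original answer.

```r
# % of population with incomes below $5,000, and murder rate, for the 20 cities
low_income <- c(16.5, 20.5, 26.3, 16.5, 19.2, 16.5, 20.2, 21.3, 17.2, 14.3,
                18.1, 23.1, 19.1, 24.7, 18.6, 24.9, 17.9, 22.4, 20.2, 16.9)
murders <- c(11.2, 13.4, 40.7, 5.3, 24.8, 12.7, 20.9, 35.7, 8.7, 9.6,
             14.5, 26.9, 15.7, 36.2, 18.1, 28.9, 14.9, 25.8, 21.7, 25.7)

fit_income <- lm(murders ~ low_income)

# Share of the variation in murder rates explained by low income, in percent
r_squared_income <- summary(fit_income)$r.squared
round(100 * r_squared_income, 2)

# Estimated slope: change in murders per million per year associated with
# one additional percentage point of low-income population
slope_income <- unname(coef(fit_income)["low_income"])
slope_income
```

The slope of roughly 2.56 describes an association across these 20 cities; it does not by itself show that low income causes higher murder rates.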