


Question

It makes sense to believe that putting lots of people together would increase the murder rate. First let's look at the correlation between the number of people living in the area and the number of murders per year. While a few people might actually look at murder rates when looking for a place to live, I think it makes more sense to use 'Population' as our X variable and 'Murders per million inhabitants per year' as our Y variable.

A. What percentage of the differences in murder rates can we explain using population?
B. Was this correlation significant in the population?
C. What percentage of the differences in murder rates can we explain using % of population with incomes below $5,000?

City  Population  % incomes below $5,000  % unemployed  Murders per million inhabitants per year
   1      587000                    16.5           6.2                                      11.2
   2      643000                    20.5           6.4                                      13.4
   3      635000                    26.3           9.3                                      40.7
   4      692000                    16.5           5.3                                       5.3
   5     1248000                    19.2           7.3                                      24.8
   6      643000                    16.5           5.9                                      12.7
   7     1964000                    20.2           6.4                                      20.9
   8     1531000                    21.3           7.6                                      35.7
   9      713000                    17.2           4.9                                       8.7
  10      749000                    14.3           6.4                                       9.6
  11     7895000                    18.1           6.0                                      14.5
  12      762000                    23.1           7.4                                      26.9
  13     2793000                    19.1           5.8                                      15.7
  14      741000                    24.7           8.6                                      36.2
  15      625000                    18.6           6.5                                      18.1
  16      854000                    24.9           8.3                                      28.9
  17      716000                    17.9           6.7                                      14.9
  18      921000                    22.4           8.6                                      25.8
  19      595000                    20.2           8.4                                      21.7
  20     3353000                    16.9           6.7                                      25.7

Explanation / Answer

A) I am using R software to solve this problem.

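First we can load the data into R using the read.table() function as below. This assumes the table has been saved as Data.txt with a header row of space-free column names (Population, PercentageIncomesBelow5K, MurdersPerMillionInhabitantsPerYear, and so on):

Data <- read.table("Data.txt", header = TRUE, sep = " ")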
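If Data.txt is not at hand, the same data frame can be built directly from the table in the question. This is only a sketch: the column names follow the ones used in the rest of this answer, and PercentageUnemployed is a hypothetical name, since the answer never refers to that column.

```r
# Data transcribed from the table in the question.
# PercentageUnemployed is an assumed column name; the others match the
# names used in the lm() calls of this answer.
Data <- data.frame(
  City = 1:20,
  Population = c(587000, 643000, 635000, 692000, 1248000, 643000, 1964000,
                 1531000, 713000, 749000, 7895000, 762000, 2793000, 741000,
                 625000, 854000, 716000, 921000, 595000, 3353000),
  PercentageIncomesBelow5K = c(16.5, 20.5, 26.3, 16.5, 19.2, 16.5, 20.2, 21.3,
                               17.2, 14.3, 18.1, 23.1, 19.1, 24.7, 18.6, 24.9,
                               17.9, 22.4, 20.2, 16.9),
  PercentageUnemployed = c(6.2, 6.4, 9.3, 5.3, 7.3, 5.9, 6.4, 7.6, 4.9, 6.4,
                           6.0, 7.4, 5.8, 8.6, 6.5, 8.3, 6.7, 8.6, 8.4, 6.7),
  MurdersPerMillionInhabitantsPerYear = c(11.2, 13.4, 40.7, 5.3, 24.8, 12.7,
                                          20.9, 35.7, 8.7, 9.6, 14.5, 26.9,
                                          15.7, 36.2, 18.1, 28.9, 14.9, 25.8,
                                          21.7, 25.7)
)

# Quick structural check: 20 cities, 5 columns
str(Data)
```

Building the data frame inline avoids any dependence on how the file was delimited or how its header was spelled.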

Correlation between two variables can be calculated using the cor() function in R as below:

cor(Data$Population,Data$MurdersPerMillionInhabitantsPerYear)

-0.0670984

We can fit a linear model in R using the lm() function. Here Population is our X variable and 'Murders per million inhabitants per year' is the Y variable:

fit <- lm(MurdersPerMillionInhabitantsPerYear ~ Population, data = Data)

We can find the coefficient value and the summary of the model using the summary() function in R.

summary(fit)

Call:
lm(formula = MurdersPerMillionInhabitantsPerYear ~ Population,
    data = Data)

Residuals:
    Min      1Q  Median      3Q     Max
-15.558  -7.652  -1.124   5.925  19.819

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.113e+01  2.992e+00   7.062 1.38e-06 ***
Population  -3.892e-07  1.364e-06  -0.285    0.779
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10.13 on 18 degrees of freedom
Multiple R-squared: 0.004502, Adjusted R-squared: -0.0508
F-statistic: 0.08141 on 1 and 18 DF, p-value: 0.7787

We can see that the R-squared value is 0.004502, so population explains only about 0.45% of the differences in murder rates.
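In a regression with a single X variable, R-squared is simply the square of the correlation coefficient, so the figure above can be cross-checked against the cor() result. A minimal sketch using only the numbers already reported in this answer:

```r
# Correlation between Population and murder rate, as reported by cor() above
r <- -0.0670984

# In simple linear regression, R-squared equals the squared correlation
r_squared <- r^2

# Express as a percentage of variance explained
round(100 * r_squared, 2)
```

With the fitted model available, the same quantity can also be read off directly with summary(fit)$r.squared.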

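B) The p-value for the Population slope is 0.779, far above the usual 0.05 level. We therefore fail to reject the hypothesis that the correlation in the population is zero, so this correlation is not significant.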
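The same conclusion can be reached from the correlation and the sample size alone, using the standard t test for a correlation coefficient with n - 2 degrees of freedom. The sketch below uses only values reported in this answer; the variable names are illustrative.

```r
# Values reported earlier in the answer
r <- -0.0670984   # sample correlation between Population and murder rate
n <- 20           # number of cities in the table

# t statistic for H0: population correlation = 0, with n - 2 degrees of freedom
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

round(c(t = t_stat, p = p_value), 4)
```

The t statistic and p-value agree with the Population row of the regression summary. With the data loaded, cor.test(Data$Population, Data$MurdersPerMillionInhabitantsPerYear) runs this test in one call.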

C) For this we can fit a linear model with '% of population with incomes below $5,000' as X variable as below:

fit1 <- lm(MurdersPerMillionInhabitantsPerYear ~ PercentageIncomesBelow5K, data = Data)

summary(fit1)

Call:
lm(formula = MurdersPerMillionInhabitantsPerYear ~ PercentageIncomesBelow5K,
    data = Data)

Residuals:
    Min      1Q  Median      3Q     Max
-9.1663 -2.5613 -0.9552  2.8887 12.3475

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)               -29.901      7.789  -3.839   0.0012 **
PercentageIncomesBelow5K    2.559      0.390   6.562 3.64e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.512 on 18 degrees of freedom
Multiple R-squared: 0.7052, Adjusted R-squared: 0.6889
F-statistic: 43.06 on 1 and 18 DF, p-value: 3.638e-06

We can see that the R-squared value is 0.7052, so 70.52% of the differences in murder rates can be explained using % of population with incomes below $5,000.
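As a quick consistency check on this second model, the reported figures can be tied together: the correlation implied by R-squared takes the sign of the slope, and with one X variable the F statistic is the square of the slope's t value. A small sketch using only the numbers from the summary output:

```r
# Values taken from the summary(fit1) output
r_squared <- 0.7052
slope     <- 2.559
t_slope   <- 6.562

# Implied correlation: square root of R-squared, signed like the slope
r <- sign(slope) * sqrt(r_squared)

# With a single predictor, the F statistic equals the squared t statistic
f_stat <- t_slope^2

round(c(r = r, F = f_stat), 2)
```

The implied correlation of about 0.84 is strongly positive, in contrast to the near-zero correlation found for Population.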
