Question
R problems
opiod <- read.csv("https://data.ct.gov/api/views/rybz-nyjw/rows.csv?accessType=DOWNLOAD",stringsAsFactors = FALSE)
1. Use the opiod data frame to create a data frame that contains two columns. Column 1 should contain the county’s name. Column 2 should contain the count of total deaths in that county. You should work with the `Death.County` variable.
2. Add the names “subregion” and “count” to the data frame from 1.
3. Remove the “NOT RECORDED” row from your data frame.
4. Pass your data frame into the function `ct.choropleth` given below. Explain ways to improve the plot to make it more informative.
ct.choropleth <- function(df){
  # generate county and state boundaries
  ct.state <- map_data("state", region = "connecticut")
  ct.county.df <- map_data("county", region = "connecticut")
  # convert county names to lower case
  county.df <- mutate_all(df, funs(tolower))
  # merge data frames to pass a single data frame to ggplot
  choropleth <- inner_join(ct.county.df, county.df, by = "subregion")
  # convert counts to type numeric
  choropleth$count <- as.numeric(choropleth$count)
  # generate choropleth
  ct.plot <- ggplot(choropleth, aes(long, lat, group = group)) +
    geom_polygon(aes(fill = count), alpha = 0.75, color = "white") +
    geom_polygon(data = ct.county.df, colour = "white", fill = NA) +
    geom_polygon(data = ct.state, color = "black", fill = NA) +
    scale_fill_gradient2(low = "yellow", mid = "orange", high = "red") +
    ggtitle("Opiod deaths in Connecticut by county") +
    labs(fill = "Deaths") +
    theme_void()
  return(ct.plot)
}
Explanation / Answer
Let me know if you have any doubt.
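opiod <- read.csv("https://data.ct.gov/api/views/rybz-nyjw/rows.csv?accessType=DOWNLOAD", stringsAsFactors = FALSE)
# Recode values that are not a usable county name as "NOT RECORDED".
# Note: the next line discards the " FAIRFIELD" entries (leading space);
# use trimws() instead if those deaths should count toward Fairfield.
opiod$Death.County <- gsub(" FAIRFIELD", "NOT RECORDED", opiod$Death.County, fixed = TRUE)
# Blank, missing, and "USA" entries are matched exactly, so real county names are never altered.
opiod$Death.County[is.na(opiod$Death.County) | opiod$Death.County %in% c("", "USA")] <- "NOT RECORDED"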
#1
library('dplyr')
df <- opiod %>%
  group_by(Death.County) %>%
  summarise(n = n())
#2
names(df) <- c('subregion', 'count')
#3
df <- df[df$subregion != 'NOT RECORDED', ]
#4
library('ggplot2')  # map_data() also needs the 'maps' package to be installed
ct.choropleth <- function(df){
  # generate county and state boundaries
  ct.state <- map_data("state", region = "connecticut")
  ct.county.df <- map_data("county", region = "connecticut")
  # convert county names to lower case
  # (funs() is deprecated in recent dplyr, so tolower is passed directly)
  county.df <- mutate_all(df, tolower)
  # merge data frames to pass a single data frame to ggplot
  choropleth <- inner_join(ct.county.df, county.df, by = "subregion")
  # convert counts to type numeric
  choropleth$count <- as.numeric(choropleth$count)
  # generate choropleth
  ct.plot <- ggplot(choropleth, aes(long, lat, group = group)) +
    geom_polygon(aes(fill = count), alpha = 0.75, color = "white") +
    geom_polygon(data = ct.county.df, colour = "white", fill = NA) +
    geom_polygon(data = ct.state, color = "black", fill = NA) +
    scale_fill_gradient2(low = "yellow", mid = "orange", high = "red") +
    ggtitle("Opiod deaths in Connecticut by county") +
    labs(fill = "Deaths") +
    theme_void()
  return(ct.plot)
}
ct.choropleth(df)
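The cleaning and counting steps (#1 to #3) can be checked without downloading the file. The sketch below uses a small made-up data frame (`toy`, with invented values) and base R's `table()` in place of dplyr. It also trims stray whitespace so that a value such as " FAIRFIELD" counts toward FAIRFIELD; that is an assumption about what the value means, not something the data set confirms.

```r
# Toy stand-in for the opiod data frame; these values are made up for illustration.
toy <- data.frame(
  Death.County = c("HARTFORD", "NEW HAVEN", " FAIRFIELD", "", "USA",
                   "HARTFORD", NA, "FAIRFIELD"),
  stringsAsFactors = FALSE
)

# Assumption: leading/trailing spaces are data-entry noise, so trim them.
county <- trimws(toy$Death.County)
# Flag blank, missing, and non-county values.
county[is.na(county) | county %in% c("", "USA")] <- "NOT RECORDED"

# Steps 1 and 2: count deaths per county and name the two columns.
toy.df <- as.data.frame(table(county), stringsAsFactors = FALSE)
names(toy.df) <- c("subregion", "count")

# Step 3: drop the "NOT RECORDED" row.
toy.df <- toy.df[toy.df$subregion != "NOT RECORDED", ]
print(toy.df)
```

The same three statements should work on `opiod$Death.County` once the real file is loaded, provided the column still has that name in the current download.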
Ways to improve the graph:
> The county names are not shown in the existing plot, so labelling each county with its name would make the map easier to read.
> Showing the actual death count for each county would be more informative than the fill color alone.
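One possible way to act on both suggestions is to draw a text label (county name plus count) inside each county. The sketch below is only an illustration: the counts in `toy.df` are invented, the label position is approximated by the midpoint of each county's bounding box, and the names `label.df` and `labelled.plot` are hypothetical. It also swaps in `scale_fill_gradient()`, since `scale_fill_gradient2()` centres on a midpoint of 0 by default and strictly positive counts would never reach the "low" color; treat that as a further suggestion rather than part of the original answer.

```r
library(ggplot2)  # map_data() also needs the 'maps' package to be installed

ct.county.df <- map_data("county", region = "connecticut")

# Hypothetical counts for illustration only; use the real data frame from step 3 instead.
counties <- unique(ct.county.df$subregion)
toy.df <- data.frame(subregion = counties,
                     count = 10 * seq_along(counties),
                     stringsAsFactors = FALSE)

choropleth <- merge(ct.county.df, toy.df, by = "subregion")
# merge() does not preserve row order, so restore the polygon drawing order.
choropleth <- choropleth[order(choropleth$order), ]

# Approximate one label position per county: midpoint of its bounding box.
label.df <- do.call(rbind, lapply(split(choropleth, choropleth$subregion), function(d) {
  data.frame(subregion = d$subregion[1],
             count = d$count[1],
             long = mean(range(d$long)),
             lat = mean(range(d$lat)),
             stringsAsFactors = FALSE)
}))
label.df$label <- paste0(label.df$subregion, "\n", label.df$count)

labelled.plot <- ggplot(choropleth, aes(long, lat, group = group)) +
  geom_polygon(aes(fill = count), color = "white") +
  geom_text(data = label.df, aes(long, lat, label = label),
            inherit.aes = FALSE, size = 3) +
  scale_fill_gradient(low = "yellow", high = "red") +
  coord_quickmap() +
  labs(fill = "Deaths") +
  theme_void()
```

Printing `labelled.plot` draws the map. Bounding-box midpoints can fall awkwardly for irregularly shaped counties, so the positions may need manual adjustment.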