Question

countries <- c('China','Hong','India','Iran','Cambodia','Japan', 'Laos', 'Philippines' ,'Vietnam' ,'Taiwan', 'Thailand', 'England' ,'France', 'Germany' ,'Greece','Holand-Netherlands','Hungary','Ireland','Italy','Poland','Portugal','Scotland','Yugoslavia', 'Canada','United-States', 'Columbia','Cuba','Dominican-Republic','Ecuador', 'El-Salvador','Guatemala', 'Haiti','Honduras', 'Mexico','Nicaragua','Outlying-US(Guam-USVI-etc)','Peru', 'Jamaica','Trinadad&Tobago', 'Puerto-Rico')

Asia <- c('China','Hong','India','Iran','Cambodia','Japan', 'Laos' , 'Philippines' ,'Vietnam' ,'Taiwan', 'Thailand')

Europe <- c('England' ,'France', 'Germany' ,'Greece','Holand-Netherlands','Hungary',  'Ireland','Italy','Poland','Portugal','Scotland','Yugoslavia', 'Canada','United-States')

Latin.and.South.America <- c('Columbia','Cuba','Dominican-Republic','Ecuador', 'El-Salvador','Guatemala','Haiti','Honduras', 'Mexico', 'Nicaragua', 'Outlying-US(Guam-USVI-etc)','Peru',  'Jamaica','Trinadad&Tobago', 'Puerto-Rico')
number <- c(1:40)
df <- data.frame(countries, number)
df
str(df)

In my data frame, the countries column is currently a factor with 40 levels. I want to reduce these to only 3 levels, namely "Asia", "Europe", and "Latin.and.South.America". How can I do that in RStudio?

Explanation / Answer

countries <- c('China','Hong','India','Iran','Cambodia','Japan', 'Laos', 'Philippines' ,'Vietnam' ,'Taiwan', 'Thailand', 'England' ,'France', 'Germany' ,'Greece','Holand-Netherlands','Hungary','Ireland','Italy','Poland','Portugal','Scotland','Yugoslavia', 'Canada','United-States', 'Columbia','Cuba','Dominican-Republic','Ecuador', 'El-Salvador','Guatemala', 'Haiti','Honduras', 'Mexico','Nicaragua','Outlying-US(Guam-USVI-etc)','Peru', 'Jamaica','Trinadad&Tobago', 'Puerto-Rico')
Asia <- c('China','Hong','India','Iran','Cambodia','Japan', 'Laos' , 'Philippines' ,'Vietnam' ,'Taiwan', 'Thailand')
Europe <- c('England' ,'France', 'Germany' ,'Greece','Holand-Netherlands','Hungary', 'Ireland','Italy','Poland','Portugal','Scotland','Yugoslavia', 'Canada','United-States')
Latin.and.South.America <- c('Columbia','Cuba','Dominican-Republic','Ecuador', 'El-Salvador','Guatemala','Haiti','Honduras', 'Mexico', 'Nicaragua', 'Outlying-US(Guam-USVI-etc)','Peru', 'Jamaica','Trinadad&Tobago', 'Puerto-Rico')
number <- c(1:40)
df <- data.frame(countries, number)
df
str(df)

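# Replace each country name with the name of its region.
for (i in seq_along(countries)) {
  if (countries[i] %in% Asia) {
    countries[i] <- 'Asia'
  } else if (countries[i] %in% Europe) {
    countries[i] <- 'Europe'
  } else if (countries[i] %in% Latin.and.South.America) {
    countries[i] <- 'Latin.and.South.America'
  }
}

# The loop changes the character vector, not df, so write the result back as a factor.
df$countries <- factor(countries)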

> factor(countries)
[1] Asia Asia Asia   
[4] Asia Asia Asia   
[7] Asia Asia Asia   
[10] Asia Asia Europe   
[13] Europe Europe Europe   
[16] Europe Europe Europe   
[19] Europe Europe Europe   
[22] Europe Europe Europe   
[25] Europe Latin.and.South.America Latin.and.South.America
[28] Latin.and.South.America Latin.and.South.America Latin.and.South.America
[31] Latin.and.South.America Latin.and.South.America Latin.and.South.America
[34] Latin.and.South.America Latin.and.South.America Latin.and.South.America
[37] Latin.and.South.America Latin.and.South.America Latin.and.South.America
[40] Latin.and.South.America
Levels: Asia Europe Latin.and.South.America
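
The same regrouping can also be done without a loop. The sketch below is one possible alternative, not the method used above: it assigns a named list to levels(), which base R uses to merge old levels into new ones. It assumes the column is a factor, so factor() is called explicitly (from R 4.0 on, data.frame() no longer converts strings to factors by default), and it builds countries from the three region vectors, which gives the same order as in the question.

```r
# Region vectors, spelled exactly as in the question
Asia <- c('China', 'Hong', 'India', 'Iran', 'Cambodia', 'Japan', 'Laos',
          'Philippines', 'Vietnam', 'Taiwan', 'Thailand')
Europe <- c('England', 'France', 'Germany', 'Greece', 'Holand-Netherlands',
            'Hungary', 'Ireland', 'Italy', 'Poland', 'Portugal', 'Scotland',
            'Yugoslavia', 'Canada', 'United-States')
Latin.and.South.America <- c('Columbia', 'Cuba', 'Dominican-Republic',
                             'Ecuador', 'El-Salvador', 'Guatemala', 'Haiti',
                             'Honduras', 'Mexico', 'Nicaragua',
                             'Outlying-US(Guam-USVI-etc)', 'Peru', 'Jamaica',
                             'Trinadad&Tobago', 'Puerto-Rico')

# Same 40 countries, in the same order as the original countries vector
countries <- c(Asia, Europe, Latin.and.South.America)
df <- data.frame(countries = factor(countries), number = 1:40)

# Each list name becomes a new level; its elements are the old levels merged into it.
# Any old level not listed here would become NA.
levels(df$countries) <- list(
  Asia = Asia,
  Europe = Europe,
  Latin.and.South.America = Latin.and.South.America
)

nlevels(df$countries)  # number of levels after merging
table(df$countries)    # rows per region
str(df)
```

Because the mapping is applied to the levels rather than to each row, it works directly on df$countries and leaves the number column untouched.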