I am trying to write a for loop in R that builds a new data frame ("results") by matching the values of a column in one data frame (df2, column "AreaName2") against a column in a different data frame (df1, column "ISLAND"). If there is no match for that first pair, I want it to move on to a second pair of columns (df2: "AreaName1" and df1: "ARCHIP"). Again, if there is a match, it should be written to the new data frame. If there is still no match, I want it to move on to a third pair of columns (df2: "Country" and df1: "COUNTRY"). If all of these columns in df2 are blank (NA), I would like to skip that row. If there is some information in one of the df2 columns but it doesn't match df1, I would like the output to state that somehow, if that is possible.
I have made an example of df1, df2, and results:
ID <- c(1,2,3,4,5, 6)
COUNTRY <- c("country1", 'country2', 'country3','country4', 'country5', 'country6')
ARCHIP <- c('archipelago1', 'archipelago2', 'archipelago3', 'archipelago4', 'archipelago5', 'archipelago6')
ISLAND <- c('someIsland1', 'someIsland2', 'someIsland3', 'someIsland4', 'someIsland5', 'someIsland6')
df1 <- data.frame(ID, COUNTRY, ARCHIP, ISLAND)
Sciname <- c("scientificName1", "scientificName2", "scientificName3", "scientificName4", "scientificName5", "scientificName6")
AreaName2 <- c("someIsland1", NA, "someIsland3", NA, NA, 'unrecognisableIsland')
AreaName1 <- c("archipelago1", "archipelago2", "archipelago3", NA, NA, 'archipelago6')
Country <- c("country1", "country2", "country3", 'country4', NA, 'country6')
df2 <- data.frame(Sciname, Country, AreaName1, AreaName2)
Species <- c("scientificName1","scientificName2", "scientificName3", "scientificName4", 'scientificName6')
Location <- c("someIsland1", "archipelago2", "someIsland3", 'country4', 'UNRECOGNISED')
results <- data.frame(Species, Location)
I was thinking that I need to do something along these lines for each pair of columns:
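matched <- character(0)  # AreaName2 values that also occur in df1$ISLAND
for (i in seq_len(nrow(df2))) {
  if (df2$AreaName2[i] %in% df1$ISLAND) {
    matched <- c(matched, df2$AreaName2[i])
  }
}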
But I am not sure how to make it work for each pair, or how to make it run through several columns - maybe I should write a separate for loop for each pair of columns I wish to match? Any ideas? Thanks!
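For reference, here is a minimal sketch of one way the cascade could be written, looping over the column pairs rather than over individual values. It rests on a few assumptions that are not settled by the description: island names are spelled with the same capitalisation in both data frames, the label "UNRECOGNISED" is an arbitrary placeholder, and a row is resolved by the most specific column that has a value (so an unmatched island is flagged even when its archipelago or country would match, which is what the example results show for scientificName6). The names `pairs`, `location`, `value` and `todo` are made up for illustration.

```r
# Rebuild the example data so the block runs on its own.
# Assumption: island spelling is consistent between df1 and df2.
df1 <- data.frame(
  ID      = 1:6,
  COUNTRY = paste0("country", 1:6),
  ARCHIP  = paste0("archipelago", 1:6),
  ISLAND  = paste0("someIsland", 1:6),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Sciname   = paste0("scientificName", 1:6),
  Country   = c("country1", "country2", "country3", "country4", NA, "country6"),
  AreaName1 = c("archipelago1", "archipelago2", "archipelago3", NA, NA, "archipelago6"),
  AreaName2 = c("someIsland1", NA, "someIsland3", NA, NA, "unrecognisableIsland"),
  stringsAsFactors = FALSE
)

# Pairs of (df2 column, df1 column), most specific level first.
pairs <- list(
  c("AreaName2", "ISLAND"),
  c("AreaName1", "ARCHIP"),
  c("Country",   "COUNTRY")
)

# One slot per row of df2; NA means "not resolved yet".
location <- rep(NA_character_, nrow(df2))

for (p in pairs) {
  value <- df2[[p[1]]]
  # Rows that are still unresolved and have a value at this level.
  todo <- is.na(location) & !is.na(value)
  # Keep the value if it occurs in the df1 column, otherwise flag it.
  location[todo] <- ifelse(value[todo] %in% df1[[p[2]]],
                           value[todo], "UNRECOGNISED")
}

# Rows where every column was NA are still NA here and are dropped.
results <- data.frame(Species = df2$Sciname, Location = location,
                      stringsAsFactors = FALSE)
results <- results[!is.na(results$Location), ]
rownames(results) <- NULL
print(results)
```

If an unmatched island should instead fall through to the archipelago and then the country, one option would be to leave `location` as NA when there is no match inside the loop and assign the placeholder only after the loop, to rows that had at least one non-NA value but never matched.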