I have data in the form of ID and food:
adf <- data.frame(ID = c("a","a","a","b","b","b","b","c","c"),
                  foods = c("apple","orange","banana","apple","banana","tomato","pear","pear","onion"))
I also have a list of required foods that each ID is being measured for completion against:
required_foods<-c("apple","tomato")
I am interested in producing a column called "missing_foods" that holds a comma-separated list of every food in required_foods that does not appear in the foods column of my data, grouped by ID.
The desired_output below is an example of what I'm hoping to accomplish.
desired_output <- data.frame(ID = c("a","a","a","b","b","b","b","c","c"),
                             foods = c("apple","orange","banana","apple","banana","tomato","pear","pear","onion"),
                             missing_foods = c("tomato","tomato","tomato","","","","","apple,tomato","apple,tomato"))
My attempts at solving this so far have been fruitless. Ideally, I'm hoping for a dplyr answer that is flexible enough to handle required_foods lists of varying lengths. I will ultimately be making multiple required_... lists and want to produce a new column for each one.
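To pin down the logic I'm after, here is a rough base R sketch of the per-ID computation. It is not the dplyr solution I'm looking for; the helper name missing_for is made up purely for illustration, and I've only checked it against the toy data above.

```r
adf <- data.frame(
  ID = c("a","a","a","b","b","b","b","c","c"),
  foods = c("apple","orange","banana","apple","banana","tomato","pear","pear","onion"),
  stringsAsFactors = FALSE
)
required_foods <- c("apple", "tomato")

# Hypothetical helper (illustrative name): required items that are absent
# from one ID's foods, collapsed into a single comma-separated string.
# An empty set difference collapses to "".
missing_for <- function(foods, required) {
  paste(setdiff(required, foods), collapse = ",")
}

# One string per ID, named by ID
per_id <- vapply(split(adf$foods, adf$ID), missing_for, character(1),
                 required = required_foods)

# Map the per-ID result back onto every row of that ID
adf$missing_foods <- unname(per_id[adf$ID])
```

Because the required list is an argument, the same helper could in principle be reused for each of my other required_... lists. What I can't work out is how to express this cleanly inside group_by() and mutate().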
My attempts:
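library(dplyr)

# Attempt 1: gives one TRUE/FALSE per required food rather than the food names
adf2 <- adf %>%
  group_by(ID) %>%
  mutate(missing_foods = !(required_foods %in% foods))

# Attempt 2: sep = "," does not collapse the vector into a single string
adf2 <- adf %>%
  group_by(ID) %>%
  mutate(missing_foods = paste(!(required_foods %in% foods), sep = ","))

# Attempt 3: a for loop inside mutate(); f iterates over indices, not food names
adf2 <- adf %>%
  group_by(ID) %>%
  mutate(missing_foods = for (f in 1:length(required_foods)) {
    ifelse(f %in% required_foods, paste0(""), paste0(f, ","))
  })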
Any help would be greatly appreciated.