I have a very large database (thousands of questions) from a forum where people answer questions and each answer is either accepted or not. If someone answers the same question more than once, I need to combine their answers into a single row and place it at the time of their first answer. Here is a made-up example of what I'm dealing with, starting with the dataframe:
df1 <- data.frame(
  questionID = c(1, 1, 1, 1, 2, 2, 2),
  userID     = c(101, 101, 101, 102, 102, 103, 102),
  accepted   = c(0, 0, 1, 0, 0, 1, 0),
  answer     = c("text1", "text2", "text3", "text4", "text5", "text6", "text7"),
  time       = c("12:00", "12:30", "1:00", "1:30", "2:00", "2:30", "3:00"))
Since userID 101 answered questionID 1 three times and the third answer was accepted, I need to concatenate the three answers into one row, put it at the earliest time (12:00), and mark it as accepted. The same goes for userID 102, who answered questionID 2 twice with neither answer accepted. The result should look like this output dataframe:
out <- data.frame(
  questionID = c(1, 1, 2, 2),
  userID     = c(101, 102, 102, 103),
  accepted   = c(1, 0, 0, 1),
  answer     = c("text1+text2+text3", "text4", "text5+text7", "text6"),
  time       = c("12:00", "1:30", "2:00", "2:30"))
I've seen some solutions for problems like this but none appear to address this precise situation. Is there some way to do this in R?
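For illustration, here is a minimal sketch of one way the grouping described above might be done in base R. It is not a definitive solution and rests on two assumptions that are not stated in the data itself: the rows are already in chronological order (so the first row of each questionID/userID pair holds the earliest time), and a combined row counts as accepted if any of its answers was accepted. The names `key`, `first_row`, `keys`, and `res` are made up for this example.

```r
df1 <- data.frame(
  questionID = c(1, 1, 1, 1, 2, 2, 2),
  userID     = c(101, 101, 101, 102, 102, 103, 102),
  accepted   = c(0, 0, 1, 0, 0, 1, 0),
  answer     = c("text1", "text2", "text3", "text4", "text5", "text6", "text7"),
  time       = c("12:00", "12:30", "1:00", "1:30", "2:00", "2:30", "3:00"),
  stringsAsFactors = FALSE)

# One key per (questionID, userID) pair
key <- paste(df1$questionID, df1$userID, sep = "_")

# TRUE at the first answer of each pair.
# Assumption: rows are already sorted by time, so "first" means "earliest".
first_row <- !duplicated(key)
keys <- key[first_row]

# Start from the first row of each pair, which keeps the original row order
res <- df1[first_row, c("questionID", "userID")]

# Assumption: the combined row is accepted if any of its answers was accepted
res$accepted <- as.numeric(tapply(df1$accepted, key, max)[keys])

# Concatenate the answers of each pair in their original order
res$answer <- as.character(tapply(df1$answer, key, paste, collapse = "+")[keys])

# Time of the first answer in each pair
res$time <- df1$time[first_row]

rownames(res) <- NULL
print(res)
```

If the real data is not sorted by time, it would need to be ordered first, and the time column would need to be parsed into a proper date-time type, since strings such as "12:00" and "1:00" do not sort chronologically as text.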