R Conditionally Merge Rows

I have a very large database (with thousands of questions) from a forum where people answer questions and their answers are either accepted or not. If someone answers the same question more than once, I need to combine their answers into a single row and place it at the time of their first answer. Here is a made-up example of what I'm dealing with:

Here is the data frame:

    df1 <- data.frame(
      questionID = c(1, 1, 1, 1, 2, 2, 2),
      userID     = c(101, 101, 101, 102, 102, 103, 102),
      accepted   = c(0, 0, 1, 0, 0, 1, 0),
      answer     = c('text1', 'text2', 'text3', 'text4', 'text5', 'text6', 'text7'),
      time       = c('12:00', '12:30', '1:00', '1:30', '2:00', '2:30', '3:00'))

Since userID 101 answered questionID 1 three times, and the third answer was accepted, I need to concatenate the three answers, mark the combined row as accepted, and keep the earliest time (12:00). The same goes for userID 102, who answered questionID 2 twice with neither answer accepted. The result would look like this (given as the output data frame):

    out <- data.frame(
      questionID = c(1, 1, 2, 2),
      userID     = c(101, 102, 102, 103),
      accepted   = c(1, 0, 0, 1),
      answer     = c('text1+text2+text3', 'text4', 'text5+text7', 'text6'),
      time       = c('12:00', '1:30', '2:00', '2:30'))

I've seen some solutions for problems like this but none appear to address this precise situation. Is there some way to do this in R?
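
For reference, the transformation described above could be sketched in base R roughly as follows. This is only a minimal sketch, not a tested solution for the full database. It assumes that the rows are already in chronological order (so the first row of each questionID/userID pair carries the earliest time), that a combined row counts as accepted if any of its answers was accepted, and that `+` is the separator. The names `key`, `grp`, `first` and `merged` are made up for illustration.

```r
# Example data from the question (strings kept as character, not factor)
df1 <- data.frame(
  questionID = c(1, 1, 1, 1, 2, 2, 2),
  userID     = c(101, 101, 101, 102, 102, 103, 102),
  accepted   = c(0, 0, 1, 0, 0, 1, 0),
  answer     = c("text1", "text2", "text3", "text4", "text5", "text6", "text7"),
  time       = c("12:00", "12:30", "1:00", "1:30", "2:00", "2:30", "3:00"),
  stringsAsFactors = FALSE
)

# One key per (questionID, userID) pair
key <- paste(df1$questionID, df1$userID, sep = "_")

# Factor whose levels follow order of first appearance, so that
# tapply() returns groups in the same order as the rows picked by `first`
grp <- factor(key, levels = unique(key))

# TRUE for the first row of each pair (assumed to be the earliest answer)
first <- !duplicated(key)

merged <- data.frame(
  questionID = df1$questionID[first],
  userID     = df1$userID[first],
  # accepted if any answer in the group was accepted
  accepted   = as.vector(tapply(df1$accepted, grp, max)),
  # concatenate all answers of the group in row order
  answer     = as.vector(tapply(df1$answer, grp, paste, collapse = "+")),
  # time of the first answer in the group
  time       = df1$time[first],
  stringsAsFactors = FALSE
)

print(merged)
```

If the real data are not sorted by time, they would need to be ordered first on a proper time column; the text values shown here (such as "1:00" following "12:30") cannot be sorted reliably as strings.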

