Channel: Active questions tagged r - Stack Overflow

Join two data sets based on one column [closed]

I have two data sets (DF1, DF2) with different numbers of rows, and I am trying to merge them using the unique ID column that is present in both. I only want rows whose IDs are present in both DF1 and DF2: IDs that are in DF2 but not in DF1 should be removed, and vice versa. I also want all the corresponding columns from DF1 and DF2 to appear in the new data set, DF3.

I tried semi_join, but it only includes the columns from DF1 and not those from DF2. inner_join returns 50,000 rows, which cannot be right: DF1 contains 40,000 rows and DF2 contains 90,000 rows, so the result should have fewer than 40,000 rows.

library(dplyr)
# semi_join keeps only DF1's columns; inner_join keeps columns from both tables
test <- semi_join(x = DF1, y = DF2, by = "ID")
test2 <- inner_join(x = DF1, y = DF2, by = "ID")
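
One possible explanation for an inner_join result that is larger than DF1 is that ID is not actually unique in DF2, since inner_join returns one row for every matching pair of rows. The following is a minimal sketch of how that could be checked. The toy data frames and the names dup_in_DF1 and dup_in_DF2 are made up for illustration and are not the data from the question.

```r
library(dplyr)

# Hypothetical toy data: ID is unique in DF1, but ID 2 appears three times in DF2.
DF1 <- data.frame(ID = c(1, 2, 3), Date = c("d1", "d2", "d3"))
DF2 <- data.frame(ID = c(2, 2, 2, 3, 4), Action = c("a", "b", "c", "d", "e"))

# List the IDs that occur more than once in each table
dup_in_DF1 <- DF1 %>% count(ID) %>% filter(n > 1)  # no rows: DF1 IDs are unique
dup_in_DF2 <- DF2 %>% count(ID) %>% filter(n > 1)  # one row: ID 2 with n = 3

# inner_join returns one row per matching pair:
# three rows for ID 2 plus one row for ID 3
joined <- inner_join(DF1, DF2, by = "ID")
nrow(joined)  # 4, which is more than the 3 rows in DF1
```

If either check returns rows on the real data, the IDs are repeated and the join is expected to grow. Whether to deduplicate first, and which duplicate to keep, depends on what the repeated rows mean.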

DF1 and DF2 are given below, and DF3 is my desired result.

DF1
  ID     Name                    Date
1 98251  MacDonald, Nich 100000  2013/07/21
2 98252  John, B~ 100000         2013/06/10
3 98253  Larry, B~ 100000        2013/04/13

DF2
  ID     Name                    Action
1 98252  Bond, Nich 100000       Eat
2 98253  John, B~ 100000         Eat
3 98256  Larry, B~ 100000        Eat

DF3
  ID     Name                    Date        Action
2 98252  John, B~ 100000         2013/06/10  Eat
3 98253  Larry, B~ 100000        2013/04/13  Eat
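
For reference, the desired DF3 can be reproduced from the sample rows with an inner join, as in the sketch below. Two things here are assumptions: the 100000 that follows each name in the printout is left out because it is unclear which column it belongs to, and DF2's Name column is dropped before joining because DF3 shows only the names from DF1.

```r
library(dplyr)

# Sample rows retyped from the printout (the trailing 100000 is omitted, see above)
DF1 <- data.frame(
  ID   = c(98251, 98252, 98253),
  Name = c("MacDonald, Nich", "John, B~", "Larry, B~"),
  Date = c("2013/07/21", "2013/06/10", "2013/04/13")
)
DF2 <- data.frame(
  ID     = c(98252, 98253, 98256),
  Name   = c("Bond, Nich", "John, B~", "Larry, B~"),
  Action = c("Eat", "Eat", "Eat")
)

# Keep only ID and Action from DF2 so the result has a single Name column (DF1's),
# then keep only the rows whose ID appears in both tables
DF3 <- inner_join(DF1, select(DF2, ID, Action), by = "ID")
DF3
```

Joining the two full tables by ID alone would keep both name columns, renamed Name.x and Name.y by default, because Name exists in both tables but is not part of the join key.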
