Channel: Active questions tagged r - Stack Overflow

combine two similar columns in r


I'm trying to combine two columns that contain essentially the same information, but each column is missing some values that the other one has. Column "wasiIQw1" holds the data for half of the group, while column "w1iq" holds the data for the other half of the group.


    select(gadd.us,nidaid,wasiIQw1,w1iq)[1:10,]
         nidaid wasiIQw1 w1iq
1  45-D11150341      104   NA
2  45-D11180321       82   NA
3  45-D11220022       93   93
4  45-D11240432      118   NA
5  45-D11270422       99   NA
6  45-D11290422       82   82
7  45-D11320321       99   99
8  45-D11500021       99   99
9  45-D11500311       95   95
10 45-D11520011      111  111

    select(gadd.us,nidaid,wasiIQw1,w1iq)[384:394,]
       nidaid wasiIQw1 w1iq
384 H1900442S       NA   62
385 H1930422S       NA   83
386 H1960012S       NA   89
387 H1960321S       NA   90
388 H2020011S       NA   96
389 H2020422S       NA  102
390 H2040011S       NA  102
391 H2040331S       NA   94
392 H2040422S       NA  103
393 H2050051S       NA   86
394 H2050341S       NA   98

With the following code I joined df.a (a df with the id and wasiIQw1) with df.b (a df with the id and w1iq) and get the following results.

df.join <- semi_join(df.a,
                     df.b,
                     by = "nidaid")

     nidaid w1iq
1  45-D11150341   NA
2  45-D11180321   NA
3  45-D11220022   93
4  45-D11240432   NA
5  45-D11270422   NA
6  45-D11290422   82
7  45-D11320321   99
8  45-D11500021   99
9  45-D11500311   95
10 45-D11520011  111

    nidaid w1iq
384 H1900442S   62
385 H1930422S   83
386 H1960012S   89
387 H1960321S   90
388 H2020011S   96
389 H2020422S  102
390 H2040011S  102
391 H2040331S   94
392 H2040422S  103
393 H2050051S   86
394 H2050341S   98
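For context on the join used above: in dplyr, `semi_join()` is a filtering join. It keeps the rows of the first table that have a match in the second and returns only the first table's columns, so it cannot bring values across from the other table. A mutating join such as `full_join()` keeps both value columns side by side. The sketch below uses small made-up versions of df.a and df.b (three ids taken from the output above; the split of ids between the two tables is an assumption for illustration only).

```r
library(dplyr)

# Hypothetical miniature versions of df.a and df.b, not the real data
df.a <- data.frame(nidaid   = c("45-D11150341", "45-D11220022"),
                   wasiIQw1 = c(104, 93))
df.b <- data.frame(nidaid = c("45-D11220022", "H1900442S"),
                   w1iq   = c(93, 62))

# Filtering join: only rows of df.a with a match in df.b, only df.a's columns
semi <- semi_join(df.a, df.b, by = "nidaid")

# Mutating join: every id from either table, both value columns kept
full <- full_join(df.a, df.b, by = "nidaid")

names(semi)  # columns come from df.a only
names(full)  # nidaid plus both IQ columns
```

With both columns present after the full join, the remaining step is to pick the non-missing value in each row.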

All of this works except for the first four "NA"s, which won't merge. Other "_join" functions from dplyr have not worked either. Do you have any tips for combining these two columns so that no data is lost and every "NA" is filled in whenever the other column has a value?
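One possible approach, offered as a minimal sketch rather than a definitive answer: since both columns already sit in the same data frame, no join may be needed at all. `dplyr::coalesce()` returns, position by position, the first non-missing value among its arguments. The toy data below copies three rows from the output above; the new column name `iq_w1` is made up for illustration.

```r
library(dplyr)

# Toy data: three rows copied from the question's output
gadd.us <- data.frame(
  nidaid   = c("45-D11150341", "45-D11220022", "H1900442S"),
  wasiIQw1 = c(104, 93, NA),
  w1iq     = c(NA, 93, 62)
)

# Take wasiIQw1 where it is present, otherwise fall back to w1iq.
# iq_w1 is a hypothetical name for the combined column.
combined <- gadd.us %>%
  mutate(iq_w1 = coalesce(wasiIQw1, w1iq))

combined$iq_w1
```

Note that where both columns hold a value, `coalesce()` keeps the one from the first argument, so it is worth checking beforehand that the two columns agree on the rows they share.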

