I'm trying to combine two columns of data that essentially contain the same information but some values are missing from each column that the other doesn't have. Column "wasiIQw1" holds the data for half of the group while column w1iq holds the data or the other half of the group.
select(gadd.us,nidaid,wasiIQw1,w1iq)[1:10,]
select(gadd.us,nidaid,wasiIQw1,w1iq)[1:10,]
nidaid wasiIQw1 w1iq
1 45-D11150341 104 NA
2 45-D11180321 82 NA
3 45-D11220022 93 93
4 45-D11240432 118 NA
5 45-D11270422 99 NA
6 45-D11290422 82 82
7 45-D11320321 99 99
8 45-D11500021 99 99
9 45-D11500311 95 95
10 45-D11520011 111 111
select(gadd.us,nidaid,wasiIQw1,w1iq)[384:394,]
nidaid wasiIQw1 w1iq
384 H1900442S NA 62
385 H1930422S NA 83
386 H1960012S NA 89
387 H1960321S NA 90
388 H2020011S NA 96
389 H2020422S NA 102
390 H2040011S NA 102
391 H2040331S NA 94
392 H2040422S NA 103
393 H2050051S NA 86
394 H2050341S NA 98
With the following code I joined df.a (a df with the id and wasiIQw1) with df.b (a df with the id and w1iq) and get the following results.
df.join <- semi_join(df.a,
df.b,
by = "nidaid")
nidaid w1iq
1 45-D11150341 NA
2 45-D11180321 NA
3 45-D11220022 93
4 45-D11240432 NA
5 45-D11270422 NA
6 45-D11290422 82
7 45-D11320321 99
8 45-D11500021 99
9 45-D11500311 95
10 45-D11520011 111
nidaid w1iq
384 H1900442S 62
385 H1930422S 83
386 H1960012S 89
387 H1960321S 90
388 H2020011S 96
389 H2020422S 102
390 H2040011S 102
391 H2040331S 94
392 H2040422S 103
393 H2050051S 86
394 H2050341S 98
All of this works except for the first four "NA"s that won't merge. Other "_join" functions from dplyr have not worked either. Do you have any tips for combining theses two columns so that no data is lost but all "NA"s are filled in if the other column has a present value?