Channel: Active questions tagged r - Stack Overflow

merging incomplete duplicate rows


I have a data frame with incomplete duplicates: rows that share the same values in two columns (dates and co.name) but differ in the remaining columns. Since there is no way of validating which row is correct, I would like to "flip a coin" and keep one row from each pair of duplicates.

I've thought of subsetting the data frame by dates and co.name and then merging that back to the original, keeping only one side, but I was wondering if there is a better way.

dates <- c(rep("2019-06-17", 2), rep("2016-01-11", 2), rep("2016-04-11", 2), "2016-04-12", "2016-04-12")
co.name <- c(rep("co1", 2), rep("co2", 2), rep("co1", 2), "co1", "co2")
total <- c(10, 10, 15, 12, 10, 9, 12, 14)
new.products <- c(3, 0, 4, 0, 2, 0, 1, 4)
df <- data.frame(dates, co.name, total, new.products)

df
       dates co.name total new.products
1 2019-06-17     co1    10            3
2 2019-06-17     co1    10            0
3 2016-01-11     co2    15            4
4 2016-01-11     co2    12            0
5 2016-04-11     co1    10            2
6 2016-04-11     co1     9            0
7 2016-04-12     co1    12            1
8 2016-04-12     co2    14            4

I can isolate the duplicated pairs with dplyr:

library(dplyr)

df %>%
  group_by(co.name, dates) %>%
  filter(n() == 2)

# A tibble: 6 x 4
# Groups:   co.name, dates [3]
  dates      co.name total new.products
  <fct>      <fct>   <dbl>        <dbl>
1 2019-06-17 co1        10            3
2 2019-06-17 co1        10            0
3 2016-01-11 co2        15            4
4 2016-01-11 co2        12            0
5 2016-04-11 co1        10            2
6 2016-04-11 co1         9            0

Expected output:

# A tibble: 5 x 4
  dates      co.name total new.products
  <fct>      <fct>   <dbl>        <dbl>
1 2019-06-17 co1        10            0
2 2016-01-11 co2        12            0
3 2016-04-11 co1         9            0
4 2016-04-12 co1        12            1
5 2016-04-12 co2        14            4

Or

# A tibble: 5 x 4
  dates      co.name total new.products
  <fct>      <fct>   <dbl>        <dbl>
1 2019-06-17 co1        10            3
2 2016-01-11 co2        15            4
3 2016-04-11 co1        10            2
4 2016-04-12 co1        12            1
5 2016-04-12 co2        14            4
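
For reference, here is a minimal sketch of the "flip a coin" idea without the subset-and-merge step. It assumes dplyr 1.0.0 or later for slice_sample() (on older versions, sample_n(1) should play the same role). The name deduped and the seed are my own choices for illustration, and the seed is only there so the random pick is repeatable.

```r
library(dplyr)

# Same example data as above
dates <- c(rep("2019-06-17", 2), rep("2016-01-11", 2), rep("2016-04-11", 2),
           "2016-04-12", "2016-04-12")
co.name <- c(rep("co1", 2), rep("co2", 2), rep("co1", 2), "co1", "co2")
total <- c(10, 10, 15, 12, 10, 9, 12, 14)
new.products <- c(3, 0, 4, 0, 2, 0, 1, 4)
df <- data.frame(dates, co.name, total, new.products)

# Assumption: a fixed seed, used only to make the coin flip reproducible
set.seed(42)

deduped <- df %>%
  group_by(co.name, dates) %>%  # one group per (co.name, dates) key
  slice_sample(n = 1) %>%       # keep one randomly chosen row per group
  ungroup()

nrow(deduped)  # one row per distinct (co.name, dates) pair
```

Groups that contain a single row pass through unchanged, so the unique rows are kept alongside one row from each duplicated pair. Note that the result comes back ordered by the grouping keys rather than in the original row order.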
