Channel: Active questions tagged r - Stack Overflow

Remove duplicates and manipulate dataframe based on conditions from 2 columns

I have a data frame like the following:

+------+-----+----------+--------+
| from | to  | distance | weight |
+------+-----+----------+--------+
|    1 |   8 |        1 |     10 |
|    2 |   6 |        1 |      9 |
|    3 |   4 |        1 |      5 |
|    4 |   5 |        3 |      9 |
|    5 |   6 |        4 |      8 |
|    6 |   2 |        5 |      2 |
|    7 |   8 |        2 |      1 |
|    4 |   3 |        5 |      6 |
|    2 |   1 |        1 |      7 |
|    6 |   8 |        4 |      8 |
|    1 |   7 |        5 |      3 |
|    8 |   4 |        6 |      7 |
|    9 |   5 |        3 |      9 |
|   10 |   3 |        8 |      2 |
+------+-----+----------+--------+
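
For anyone who wants to reproduce this, the table above can be built in R roughly as follows. The object name `df` is only an assumption for illustration, and the columns are entered as plain numeric vectors.

```r
# Reproducible version of the table above (the name `df` is arbitrary)
df <- data.frame(
  from     = c(1, 2, 3, 4, 5, 6, 7, 4, 2, 6, 1, 8, 9, 10),
  to       = c(8, 6, 4, 5, 6, 2, 8, 3, 1, 8, 7, 4, 5, 3),
  distance = c(1, 1, 1, 3, 4, 5, 2, 5, 1, 4, 5, 6, 3, 8),
  weight   = c(10, 9, 5, 9, 8, 2, 1, 6, 7, 8, 3, 7, 9, 2)
)

str(df)  # 14 observations of 4 numeric variables
```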

I want to filter the data sequentially, based on the criteria below:

  1. If a number appears in the to column, it shouldn't be repeated in either the to or the from column.
  2. A number in from can be repeated if its corresponding to is a new value that doesn't appear in any cell of the to column.
  3. I want to repeat this process until every unique value from the from and to columns combined appears at least once in either column.
  4. If a number in the from column is new and its corresponding to value is already present in either column, then replace that to and distance value with a blank.

So the resulting table would look like this:

+------+-----+----------+--------+
| from | to  | distance | weight |
+------+-----+----------+--------+
|    1 |   8 |        1 |     10 |
|    2 |   6 |        1 |      9 |
|    3 |   4 |        1 |      5 |
|    1 |   7 |        5 |      3 |
|    9 |   5 |        3 |      9 |
|   10 |     |          |      2 |
+------+-----+----------+--------+
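
To make the four criteria more concrete, here is one possible reading of them as a two-pass greedy procedure. This is only a sketch under that interpretation, not a confirmed solution. The names `filter_edges`, `used_from`, `used_to` and `uncovered` are made up for illustration, and `NA` stands in for the blank cells.

```r
# Sample data from the first table
df <- data.frame(
  from     = c(1, 2, 3, 4, 5, 6, 7, 4, 2, 6, 1, 8, 9, 10),
  to       = c(8, 6, 4, 5, 6, 2, 8, 3, 1, 8, 7, 4, 5, 3),
  distance = c(1, 1, 1, 3, 4, 5, 2, 5, 1, 4, 5, 6, 3, 8),
  weight   = c(10, 9, 5, 9, 8, 2, 1, 6, 7, 8, 3, 7, 9, 2)
)

# Hypothetical helper: one interpretation of criteria 1 to 4
filter_edges <- function(d) {
  used_from <- c()
  used_to   <- c()
  keep      <- logical(nrow(d))

  # Pass 1 (criteria 1 and 2): walk the rows in order and keep a row when
  # its `from` has not already been used as a `to`, and its `to` has not
  # appeared in either kept column. A repeated `from` is allowed.
  for (i in seq_len(nrow(d))) {
    from_i <- d$from[i]
    to_i   <- d$to[i]
    if (!(from_i %in% used_to) && !(to_i %in% c(used_from, used_to))) {
      keep[i]   <- TRUE
      used_from <- c(used_from, from_i)
      used_to   <- c(used_to, to_i)
    }
  }
  out <- d[keep, ]

  # Pass 2 (criteria 3 and 4): any value still missing from the kept rows
  # is added back via its first row as `from`, with `to` and `distance`
  # set to NA. Values that never occur in `from` are skipped here.
  uncovered <- setdiff(unique(c(d$from, d$to)), c(used_from, used_to))
  for (m in uncovered) {
    idx <- which(d$from == m)
    if (length(idx) > 0) {
      extra          <- d[idx[1], ]
      extra$to       <- NA_real_
      extra$distance <- NA_real_
      out <- rbind(out, extra)
    }
  }

  rownames(out) <- NULL
  out
}

result <- filter_edges(df)
print(result)
```

On the sample data this keeps the rows 1 -> 8, 2 -> 6, 3 -> 4, 1 -> 7 and 9 -> 5, then appends 10 with `NA` for to and distance, which matches the expected table. Whether the same greedy order is right for other inputs depends on how the criteria are meant to interact.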
