I have a dataframe as shown below:
+------+-----+----------+--------+
| from | to | distance | weight |
+------+-----+----------+--------+
| 1 | 8 | 1 | 10 |
| 2 | 6 | 1 | 9 |
| 3 | 4 | 1 | 5 |
| 4 | 5 | 3 | 9 |
| 5 | 6 | 4 | 8 |
| 6 | 2 | 5 | 2 |
| 7 | 8 | 2 | 1 |
| 4 | 3 | 5 | 6 |
| 2 | 1 | 1 | 7 |
| 6 | 8 | 4 | 8 |
| 1 | 7 | 5 | 3 |
| 8 | 4 | 6 | 7 |
| 9 | 5 | 3 | 9 |
| 10 | 3 | 8 | 2 |
+------+-----+----------+--------+
I want to sequentially filter the data based on the criteria below:
- If a number appears in the `to` column, it shouldn't be repeated in either the `to` or the `from` column
- The number in `from` can be repeated if its corresponding `to` is a new value and isn't available in any of the cells in the `to` column
- I want to repeat this process until all the unique values from the `from` and `to` columns combined appear at least once in either of the columns
- If a number in the `from` column is a new number and its corresponding `to` value is already present in either of the columns, then replace that `to` and distance value with a blank
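
To make the rules concrete, here is a rough sketch of one possible reading of them. It uses plain Python tuples rather than a real dataframe, and the name `filter_rows` and the two-pass structure are my own assumptions, not an established method: the first pass keeps a row when its `from` has not already been used as a `to` and its `to` has not appeared in either column of the kept rows; the second pass adds a row with a blank `to` and distance for any `from` value that is still missing. Values that occur only in `to` and never get kept are not handled by this sketch.

```python
# Rows as (from, to, distance, weight) tuples, in the original order.
rows = [
    (1, 8, 1, 10), (2, 6, 1, 9), (3, 4, 1, 5), (4, 5, 3, 9),
    (5, 6, 4, 8), (6, 2, 5, 2), (7, 8, 2, 1), (4, 3, 5, 6),
    (2, 1, 1, 7), (6, 8, 4, 8), (1, 7, 5, 3), (8, 4, 6, 7),
    (9, 5, 3, 9), (10, 3, 8, 2),
]


def filter_rows(rows):
    """Hypothetical two-pass filter; one interpretation of the criteria."""
    kept = []
    seen_from, seen_to = set(), set()

    # Pass 1: keep a row only if its `from` was never kept as a `to`
    # and its `to` has not appeared in either column of the kept rows.
    for frm, to, dist, weight in rows:
        if frm in seen_to or to in seen_from or to in seen_to:
            continue
        kept.append((frm, to, dist, weight))
        seen_from.add(frm)
        seen_to.add(to)

    # Pass 2: a `from` value that still appears nowhere gets one row
    # with `to` and distance blanked out (None stands in for a blank).
    for frm, to, dist, weight in rows:
        if frm not in seen_from and frm not in seen_to:
            kept.append((frm, None, None, weight))
            seen_from.add(frm)

    return kept


for row in filter_rows(rows):
    print(row)
```

With a pandas dataframe, the tuples could presumably be taken from `df.itertuples(index=False)`, but I have not settled on that.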
So the resulting table would look like this:
+------+-----+----------+--------+
| from | to | distance | weight |
+------+-----+----------+--------+
| 1 | 8 | 1 | 10 |
| 2 | 6 | 1 | 9 |
| 3 | 4 | 1 | 5 |
| 1 | 7 | 5 | 3 |
| 9 | 5 | 3 | 9 |
| 10 | | | 2 |
+------+-----+----------+--------+