This has been a tricky problem, and I am really excited to hear solutions. I have what I call "double-columns", i.e. columns whose content can be split into two separate columns.
This is my input:
data <- structure(list(`A1-A2` = c(2, 1, 1), `A1-A3` = c(2, 1, 2)), row.names = c(NA,
-3L), class = c("tbl_df", "tbl", "data.frame"))
# A tibble: 3 x 2
  `A1-A2` `A1-A3`
    <dbl>   <dbl>
1       2       2
2       1       1
3       1       2
For one column, I can demonstrate what I want to do, but not for several:
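library(tidyverse)

data %>%
  # sep = ":" never matches, so A1 receives the whole value and A2 is filled with NA (with a warning)
  separate(`A1-A2`, into = c("A1", "A2"), sep = ":") %>%
  mutate_at(.vars = c(1:2), as.numeric) %>%
  mutate(A2 = A1 - 1) %>%                # A2 becomes 1 if the value was 2, otherwise 0
  mutate(A1 = ifelse(A1 == 2, 0, A1))    # A1 stays 1 if the value was 1, otherwise 0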
# A tibble: 3 x 3
     A1    A2 `A1-A3`
  <dbl> <dbl>   <dbl>
1     0     1       2
2     1     0       1
3     1     0       2
- This splits the A1-A2 column into two separate columns A1 and A2.
- If its value was 1, sets 1 into the left column (A1)
- If its value was 2, sets 1 into the right column (A2)

This only works, as you can see in the code above, for splitting 1 double-column.
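To make the rule in the bullets concrete, here is a minimal base R sketch of it for a single double-column. The helper name `split_double` and its arguments `left` and `right` are hypothetical, introduced only for illustration; the sketch assumes the column contains only the values 1 and 2.

```r
# Hypothetical helper (not part of the original code): turns one
# double-column into two 0/1 indicator columns.
split_double <- function(x, left, right) {
  out <- data.frame(
    as.numeric(x == 1),  # 1 in the left column where the value was 1
    as.numeric(x == 2)   # 1 in the right column where the value was 2
  )
  names(out) <- c(left, right)
  out
}

# Applied to the values of the A1-A2 column from the input above
split_double(c(2, 1, 1), left = "A1", right = "A2")
```

For the `A1-A2` values this gives A1 = 0, 1, 1 and A2 = 1, 0, 0, the same as the pipeline above.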
Two challenges:
- How can I formulate my code generically, so that it works for any number of double-columns?
- How can I avoid problems caused by several split columns having the same name (e.g. when the double-columns A1-A2, A1-A3, A2-A3 are split, A1, A2 and A3 will each occur twice)?
Approaches in tidyverse (purrr::map) are preferred, but I am open to other solutions.
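One direction that might work, as a rough sketch rather than a finished answer: iterate over the columns with `purrr::imap()`, derive the two new names from each column name, and prefix them with the original name so that duplicates cannot occur. The function name `split_all` and the `<double-column>_<part>` naming scheme are assumptions of mine, and the sketch assumes every column is a double-column named `left-right` that contains only the values 1 and 2.

```r
library(dplyr)
library(purrr)
library(tibble)

data <- tibble(`A1-A2` = c(2, 1, 1), `A1-A3` = c(2, 1, 2))

# Hypothetical generic version: one pair of 0/1 columns per double-column.
split_all <- function(df) {
  pieces <- imap(df, function(x, nm) {
    # "A1-A3" -> c("A1", "A3")
    parts <- strsplit(nm, "-", fixed = TRUE)[[1]]
    out <- tibble(
      as.numeric(x == 1),  # left column: 1 where the value was 1
      as.numeric(x == 2)   # right column: 1 where the value was 2
    )
    # Prefix with the original name so the new names stay unique
    set_names(out, paste(nm, parts, sep = "_"))
  })
  # Drop the list names so the pieces are bound as plain columns
  bind_cols(unname(pieces))
}

split_all(data)
```

With the input above this should produce the columns `A1-A2_A1`, `A1-A2_A2`, `A1-A3_A1` and `A1-A3_A3`. Whether prefixed names are acceptable, or whether the duplicated A1 columns should instead be combined somehow, is exactly the part I am unsure about.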
Tricky, isn't it?