I have a table that looks like this:
> head(dt)
variant_id transcript_id HH HNL NLNL
1: chr10_60842447_A_G_b38 chr10_60871326_60871443 32968;685 1440;20 337;1
2: chr10_60846892_G_A_b38 chr10_60871326_60871443 33157;690 1251;15 337;1
3: chr10_60847284_C_T_b38 chr10_60871326_60871443 33157;690 1251;15 337;1
4: chr10_60849980_T_C_b38 chr10_60871326_60871443 33157;690 1251;15 337;1
5: chr10_60850566_A_T_b38 chr10_60871326_60871443 33157;690 1251;15 337;1
6: chr10_60852394_C_A_b38 chr10_60871326_60871443 33157;690 1251;15 337;1
What I would like to do is take the values in column HH and divide the number before the semicolon by the number after it. For example, for the first row, I would like to compute 32968/685 (which would be 48.13). Then I would like to do the same for the values in column NLNL (so for the first row that would be 337/1 = 337), and then subtract the HH value from the NLNL value, so 337 - 48.13 = 288.87. I would then like to place that value into a new column called diff, for all rows.
How would I go about doing this? I can pretty easily figure out how to divide the values of one column with another and put the result in a new column, but I don't know how to extract semi-colon separated values from within a cell and manipulate them.
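One possible approach, as a minimal sketch: it assumes dt is a data.table and that every cell in HH and NLNL holds exactly two semicolon-separated integers. The helper name ratio is hypothetical, and the toy data is just the first two rows of the table above.

```r
library(data.table)

# Toy data: the first two rows of the table shown above
dt <- data.table(
  variant_id    = c("chr10_60842447_A_G_b38", "chr10_60846892_G_A_b38"),
  transcript_id = "chr10_60871326_60871443",
  HH   = c("32968;685", "33157;690"),
  HNL  = c("1440;20", "1251;15"),
  NLNL = c("337;1", "337;1")
)

# Hypothetical helper: split "a;b" strings on the semicolon and return a / b.
# tstrsplit() returns one vector per piece; type.convert = TRUE makes them numeric.
ratio <- function(x) {
  parts <- tstrsplit(x, ";", fixed = TRUE, type.convert = TRUE)
  parts[[1]] / parts[[2]]
}

# Add the new column by reference: NLNL ratio minus HH ratio
dt[, diff := ratio(NLNL) - ratio(HH)]

print(dt[, .(variant_id, HH, NLNL, diff)])
```

For the first row this gives 337 - 32968/685, which is about 288.87. If some cells could be missing the second number, the division would return NA for those rows, so that case would need separate handling.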