Any help would be greatly appreciated
I have a file exported from a PCR plate software. I have already coded the call for all alleles and have now merged them into one data frame.
I need to create a new variable merging the 3 alleles (G1-1, G1-2, and G2) to get a final genotype.
I then need to count the occurrences of the alleles to generate the three APOL1 risk variables.
Allele logic for final genotype:
+/G2 = (G1-1-1(+) & G1-1-2(+)) & (G1-2-1(+) & G1-2-2(+)) & (occurrence of (G2) at either G2-1 or G2-2)
+/+ = (G1-1-1(+) & G1-1-2(+)) & (G1-2-1(+) & G1-2-2(+)) & (G2-1(+) & G2-2(+))
G2/G2 = (G1-1-1(+) & G1-1-2(+)) & (G1-2-1(+) & G1-2-2(+)) & (G2-1(G2) & G2-2(G2))
G1^GM/+ = (occurrence of (G1^S342G) at either G1-1-1 or G1-1-2) & (occurrence of (G1^I384M) at either G1-2-1 or G1-2-2) & (G2-1(+) & G2-2(+))
G1^G+/+ = (occurrence of (G1^S342G) at either G1-1-1 or G1-1-2) & (G1-2-1(+) & G1-2-2(+)) & (G2-1(+) & G2-2(+))
G1^GM/G1^GM = (occurrence of (G1^S342G) at both G1-1-1 and G1-1-2) & (occurrence of (G1^I384M) at both G1-2-1 and G1-2-2) & (G2-1(+) & G2-2(+))
G1^GM/G2 = (occurrence of (G1^S342G) at either G1-1-1 or G1-1-2) & (occurrence of (G1^I384M) at either G1-2-1 or G1-2-2) & (occurrence of (G2) at either G2-1 or G2-2)
G1^G+/G2 = (occurrence of (G1^S342G) at either G1-1-1 or G1-1-2) & (G1-2-1(+) & G1-2-2(+)) & (occurrence of (G2) at either G2-1 or G2-2)
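To make the rules above concrete, here is a minimal base R sketch of one way they could be coded. It is not a tested solution for the real export. The toy data frame `calls`, the helper names `n_g`, `n_m`, `n_g2`, `key`, and `lookup`, and the new column `genotype` are all made up for illustration. It assumes "either" means exactly one of the two calls, "both" means two, and that G1^G+/G2 carries one G2 call. Any combination not covered by the eight rules comes out as NA.

```r
# Toy data in the same shape as the merged data frame (values are invented).
calls <- data.frame(
  `G1-1-1` = c("+", "+", "+", "G1^S342G", "G1^S342G"),
  `G1-1-2` = c("+", "+", "+", "+", "G1^S342G"),
  `G1-2-1` = c("+", "+", "+", "G1^I384M", "G1^I384M"),
  `G1-2-2` = c("+", "+", "+", "+", "G1^I384M"),
  `G2-1`   = c("+", "+", "G2", "+", "+"),
  `G2-2`   = c("+", "G2", "G2", "G2", "+"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Count the risk calls at each site: 0, 1 ("either") or 2 ("both").
n_g  <- (calls$`G1-1-1` == "G1^S342G") + (calls$`G1-1-2` == "G1^S342G")
n_m  <- (calls$`G1-2-1` == "G1^I384M") + (calls$`G1-2-2` == "G1^I384M")
n_g2 <- (calls$`G2-1` == "G2") + (calls$`G2-2` == "G2")

# Map each (S342G, I384M, G2) count pattern to a final genotype.
# The "1-0-1" entry is an assumption about what G1^G+/G2 should mean.
lookup <- c(
  "0-0-0" = "+/+",
  "0-0-1" = "+/G2",
  "0-0-2" = "G2/G2",
  "1-1-0" = "G1^GM/+",
  "1-0-0" = "G1^G+/+",
  "2-2-0" = "G1^GM/G1^GM",
  "1-1-1" = "G1^GM/G2",
  "1-0-1" = "G1^G+/G2"
)

key <- paste(n_g, n_m, n_g2, sep = "-")
calls$genotype <- unname(lookup[key])  # patterns without a rule become NA
```

Counting first and then looking up the pattern keeps each rule on one line, so a rule can be corrected or added without touching the rest.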
Original Dataframe structure
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 28 obs. of 6 variables:
$ G1-1-1 : chr "+" "+" "+" "+" ...
$ G1-1-2 : chr "+" "+" "+" "+" ...
$ G1-2-1 : chr "+" "+" "+" "+" ...
$ G1-2-2 : chr "+" "+" "+" "+" ...
$ G2-1 : chr "+" "+" "+" "+" ...
$ G2-2 : chr "G2" "+" "G2" "G2" ...
The APOL1 Risk variables logic is below:
If (+/+) categorize as 1 in "no APOL1 Risk Alleles"
If (+/G2) or (G1^GM/+) or (G1^G+/+) categorize as 1 in "1 APOL1 Risk Alleles"
If (G1^GM/G1^GM) or (G1^GM/G2) or (G2/G2) categorize as 1 in "2 APOL1 Risk Alleles"
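As a rough illustration of the risk logic above, the three 0/1 variables could be derived from a final genotype vector with `%in%`. This is only a sketch: the example `genotype` values and the column names `apol1_risk_0`, `apol1_risk_1`, and `apol1_risk_2` are invented. G1^G+/G2 is not assigned to any category in the rules as written, so here it gets 0 in all three columns.

```r
# Example final genotypes (invented values, one per sample).
genotype <- c("+/+", "+/G2", "G1^GM/G1^GM", "G1^G+/+", "G2/G2", "G1^G+/G2")

# Genotypes belonging to each risk category, taken from the rules above.
zero_risk <- c("+/+")
one_risk  <- c("+/G2", "G1^GM/+", "G1^G+/+")
two_risk  <- c("G1^GM/G1^GM", "G1^GM/G2", "G2/G2")

# One 0/1 indicator column per category.
risk <- data.frame(
  genotype     = genotype,
  apol1_risk_0 = as.integer(genotype %in% zero_risk),
  apol1_risk_1 = as.integer(genotype %in% one_risk),
  apol1_risk_2 = as.integer(genotype %in% two_risk),
  stringsAsFactors = FALSE
)

# Rows that fall into no category (here G1^G+/G2) can be found like this.
unassigned <- risk$genotype[rowSums(risk[, -1]) == 0]
```

Checking `unassigned` on the real data would show whether any genotype still needs a rule.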