
How to select lines of file based on multiple conditions of another file in R?


I have two genetic datasets. I currently filter file1 based on one column in file2 (chromosome position). However, I also need to account for a second column in file2 (chromosome), and I'm not sure how to do this.

The condition for extracting a row from file 1 is that its chromosome position must be more than 5000 larger or more than 5000 smaller than the position of every variant on the same chromosome in file 2.

For example my data looks like:

File 1:

Variant   Chromosome  Chromosome Position  
Variant1      2             14000     
Variant2      1             9000              
Variant3      8             37000          
Variant4      1             21000     

File 2:

Variant  Chromosome  Chromosome Position  
Variant1     1                 10000                   
Variant2     1                 20000                   
Variant3     8                 30000      

Expected output (variants that are more than 5000 away, in either direction, from every line of file 2 on the same chromosome):

Variant   Chromosome Position     Chromosome
Variant1    14000                  2
Variant3    37000                  8

#Variant1 at 14000 is within 5000 of Variant1 at 10000 in file 2, but it is on a different chromosome, so it is not compared and is kept.
#Variant3 is on the same chromosome as Variant3 in file 2 but is more than 5000 away, so it is kept.

I've tried coding this in Unix, but I only got the more-than-5000 +/- filtering working for each variant, without taking the chromosome into account. I've been advised to try R instead, but I'm new to R and not sure where to start. I assume I need an if statement along the lines of "if a line of file1 has the same chromosome number as a line of file2, then apply the more-than-5000 +/- filtering within that chromosome number only", with a for loop going over each row. Even just guidance on how to learn to do this would be appreciated.
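For reference, the rule described above can be sketched in base R roughly as follows. This is only a minimal sketch, not a tested solution for real data: the data frames are typed in from the example tables, and the column names `Chromosome` and `Position`, the `min_dist` variable, and the `is_far_enough` helper are names made up for illustration.

```r
# Example data typed in from the tables above; column names are illustrative.
file1 <- data.frame(
  Variant    = c("Variant1", "Variant2", "Variant3", "Variant4"),
  Chromosome = c(2, 1, 8, 1),
  Position   = c(14000, 9000, 37000, 21000)
)
file2 <- data.frame(
  Variant    = c("Variant1", "Variant2", "Variant3"),
  Chromosome = c(1, 1, 8),
  Position   = c(10000, 20000, 30000)
)

min_dist <- 5000  # distance threshold from the question

# Hypothetical helper: TRUE if a file1 variant is more than min_dist away
# from every file2 variant on the same chromosome. If file2 has no variant
# on that chromosome, all() of an empty vector is TRUE, so the row is kept.
is_far_enough <- function(chrom, pos) {
  same_chrom_pos <- file2$Position[file2$Chromosome == chrom]
  all(abs(same_chrom_pos - pos) > min_dist)
}

# Apply the helper to each row of file1, then keep the rows that pass.
keep <- mapply(is_far_enough, file1$Chromosome, file1$Position)
result <- file1[keep, ]
print(result)
```

On the example data this keeps Variant1 (chromosome 2) and Variant3 (chromosome 8), which matches the expected output. With real files, the two data frames would presumably be read in with something like `read.table(..., header = TRUE)`, and the column names adjusted to whatever the actual headers are.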

