Channel: Active questions tagged r - Stack Overflow

Combine column values with partially matching names into one semicolon-separated field in R


I've looked through SO and have not found an answer that covers what I am trying to do.

I have a giant table. The first few columns hold information about different expressed transcripts and the SNP that influences each one. The remaining columns (around a thousand) come in two kinds: columns for an individual's tissue sample (with a header such as GTEX.11DXX.1426.SM.5GIDU) and columns for the individual's ID (GTEX.11DXX). A tissue-sample column contains the number of transcripts expressed at that particular sequence (e.g. 92), while an ID column contains a binary value (1 or 0) indicating whether the allele that influences the expression of that transcript is Neandertal-inherited.

What I want to do is consolidate the data underneath the binary columns with the data underneath the transcript number columns like so:

GTEX.11DXX.1426.SM.5GIDU
0;25
1;74
1;104
1;92
0;12
...
etc.

I want to accomplish this by partially matching the column name GTEX.11DXX with GTEX.11DXX.1426.SM.5GIDU, and then dropping the binary columns so that only the columns with the long names remain.
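To make the goal concrete, here is a minimal base R sketch of the pairing I have in mind, run on a made-up two-column table. The helper name `combine_pairs`, the toy data frame `df`, and the rule that an ID column is any name with exactly two dot-separated parts are all assumptions for illustration; it also assumes a plain data.frame rather than my real table.

```r
# Toy stand-in for the real table; names follow the pattern described above.
df <- data.frame(
  GTEX.11DXX.1426.SM.5GIDU = c(25, 74, 104, 92, 12),  # transcript counts
  GTEX.11DXX               = c(0, 1, 1, 1, 0),        # binary allele flag
  check.names = FALSE
)

# Hypothetical helper: for each ID column, find the long column(s) whose name
# is that ID followed by a literal dot, prepend the binary value, drop the ID.
combine_pairs <- function(d, id_cols) {
  for (id in id_cols) {
    long <- names(d)[startsWith(names(d), paste0(id, "."))]
    for (col in long) {
      d[[col]] <- paste(d[[id]], d[[col]], sep = ";")
    }
    d[[id]] <- NULL  # remove the binary column once it has been merged
  }
  d
}

# Assumption: ID columns are exactly the names of the form GTEX.<one part>.
id_cols <- names(df)[grepl("^GTEX\\.[^.]+$", names(df))]
result <- combine_pairs(df, id_cols)
print(result)
```

On the toy data this leaves a single column, GTEX.11DXX.1426.SM.5GIDU, holding strings such as 0;25 and 1;74.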

I've tried using tidyverse's map(v, ~select_(ovary, ~matches(.))), and it kind of works, but it also matches columns whose names are one character off, like so:

[[49]]
       GTEX.13X6H.1026.SM.5SIBE GTEX.13X6H GTEX.13X6I GTEX.13X6J GTEX.13X6K
    1:                       49          0          0          0          1
    2:                       44          0          0          0          1
    3:                        3          0          0          0          1
    4:                       23          0          0          0          1
    5:                       78          0          0          0          1
   ---                                                                     
80285:                       84          1          0          0          0
80286:                        1          1          0          0          0
80287:                        0          1          0          0          0
80288:                      152          1          0          0          0
80289:                      120          1          0          0          0
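I am not sure why the match above is so loose. What I would like instead is a strict test: keep a column only if its name equals the ID or starts with the ID followed by a dot. A small sketch of that test on the column names from this output (the variable names `cols`, `id` and `exact` are just for illustration):

```r
# Column names copied from the output above.
cols <- c("GTEX.13X6H.1026.SM.5SIBE", "GTEX.13X6H", "GTEX.13X6I",
          "GTEX.13X6J", "GTEX.13X6K")
id <- "GTEX.13X6H"

# Fixed-string comparison: no regular expression involved, so the dots in
# the ID are treated literally and near-miss names are rejected.
exact <- cols[cols == id | startsWith(cols, paste0(id, "."))]
print(exact)
```

This keeps only GTEX.13X6H.1026.SM.5SIBE and GTEX.13X6H, and leaves out the I, J and K columns.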

Again, I want it to work like this:

       GTEX.13X6H.1026.SM.5SIBE
    1:                     0;49
    2:                     0;44
    3:                      0;3
    4:                     0;23

Thank you

