I'm studying scores along DNA, where each position has a score in each sample. I would like a method to tell whether some samples tend to have high scores, not overall, but position by position. Not every position is scored in every sample, so some sample/position combinations are missing. Here is a small example:
data.frame('pos'=c(1,2,3,1,2,3,1,2,5), 'sample'=c('A','A','A','B','B','B','C','C','C'), 'score'=c(1,10,5,20,40,10,0.1,5,4))
I'd like to use a Spearman (rank-based) correlation to find out whether some samples are more likely to have the "top" scoring values. I want to work with ranks because there is no real biological reason to compare scores between positions, for instance position 1 against position 2. My difficulty is that I have two qualitative variables (the sample ID and the position) and only one quantitative variable (the score). I can't work out how to tell R to group the data by position, rank the samples within each position, and then study the correlation of those rankings.
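As a rough sketch of the grouping step described above (not a full answer to the correlation question), the within-position ranking could look something like this in base R. The names `df`, `rank_in_pos`, `n_in_pos` and `mean_rank` are illustrative assumptions, and the mean rank is only a descriptive summary, not a Spearman statistic.

```r
# Example data from the question; the object name `df` is an assumption.
df <- data.frame(
  pos    = c(1, 2, 3, 1, 2, 3, 1, 2, 5),
  sample = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  score  = c(1, 10, 5, 20, 40, 10, 0.1, 5, 4)
)

# Rank the scores within each position; negating the score makes
# rank 1 the highest score at that position.
df$rank_in_pos <- ave(-df$score, df$pos, FUN = rank)

# Count how many samples have a score at each position.
df$n_in_pos <- ave(df$score, df$pos, FUN = length)

# Drop positions seen in a single sample (a rank there carries no
# information), then average the within-position ranks per sample.
keep <- df$n_in_pos > 1
mean_rank <- tapply(df$rank_in_pos[keep], df$sample[keep], mean)
print(mean_rank)
```

A lower mean rank means the sample is more often near the top. Dropping single-sample positions is a design choice made here for illustration; ties are given average ranks by `rank()` by default.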
Ultimately, I'd like a Spearman correlation score showing that, in this dataset, sample B is the top scorer at most positions.
Any idea on how to achieve that?
Thanks a lot!