I would like to try the R package rehh to perform haplotype homozygosity analyses on population genetic data.
I have fairly simple data that I would like to load into rehh, but the import fails. My genotypes are already coded as 0 = missing, 1 = ref, 2 = nonref. The code below builds a simplified dataset similar to mine. Say I want to compute cross-population EHH between two populations (in this example there are 2 populations of 5 samples each, with 10 SNPs on 3 chromosomes):
### Generating example data
## Creating example SNP map
V1 <- c("SNP_1", "SNP_2", "SNP_3", "SNP_4", "SNP_5", "SNP_6", "SNP_7", "SNP_8", "SNP_9", "SNP_10")
V2 <- c(1,1,1,1,1,2,2,2,3,3)
V3 <- c(15,28,30,40,47,9,17,22,4,11)
V4 <- c("T", "A", "T", "G", "G", "G", "A", "T", "G", "A")
V5 <- c("C", "T", "C", "A", "A", "A", "T", "A", "A", "C")
example_SNPmap <- data.frame(V1, V2, V3, V4, V5)
## Creating example genotypes
# Population 1
Sample_1a <- c(2,1,1,2,1,1,1,1,2,2)
Sample_2a <- c(1,1,2,1,1,2,2,1,1,2)
Sample_3a <- c(1,2,1,1,2,2,1,1,1,1)
Sample_4a <- c(2,2,1,1,1,2,2,2,1,2)
Sample_5a <- c(2,1,1,2,1,2,1,1,2,2)
example_geno_pop1 <- data.frame(Sample_1a, Sample_2a, Sample_3a, Sample_4a, Sample_5a)
# Population 2
Sample_1b <- c(2,2,1,1,2,2,1,1,2,2)
Sample_2b <- c(1,2,1,1,2,2,1,1,1,2)
Sample_3b <- c(1,1,1,1,2,2,1,1,1,2)
Sample_4b <- c(2,2,1,1,2,2,1,2,1,2)
Sample_5b <- c(1,2,1,1,2,2,1,1,2,2)
example_geno_pop2 <- data.frame(Sample_1b, Sample_2b, Sample_3b, Sample_4b, Sample_5b)
example_SNPmap
example_geno_pop1
example_geno_pop2
But when I run data2haplohh to import these data frames, I get an error:
hap <- data2haplohh(hap_file = example_geno_pop1,
map_file = example_SNPmap,
haplotype.in.columns = TRUE,
recode.allele = FALSE,
chr.name = 1)
Error:
Error in read.table(map_file, row.names = 1, colClasses = "character") :
'file' must be a character string or connection
Sorry if I'm missing something simple or obvious. Any help would be much appreciated, thanks.
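A possible lead, offered as a sketch rather than a confirmed fix: the error is raised inside read.table(map_file, ...), which suggests that data2haplohh treats map_file (and presumably hap_file) as a path to a file on disk, not as a data frame held in memory. Assuming that is the case, writing the tables to plain text files first and passing the paths should get past this particular error. The example below uses only base R and small stand-in tables; the names snp_map, geno, map_path and hap_path are made up for illustration, and the final data2haplohh call is left commented out because its accepted arguments may differ between rehh versions.

```r
# Small stand-ins for example_SNPmap and example_geno_pop1 (first three SNPs only).
snp_map <- data.frame(
  V1 = c("SNP_1", "SNP_2", "SNP_3"),  # SNP name
  V2 = c(1, 1, 1),                    # chromosome
  V3 = c(15, 28, 30),                 # position
  V4 = c("T", "A", "T"),
  V5 = c("C", "T", "C")
)
geno <- data.frame(
  Sample_1a = c(2, 1, 1),
  Sample_2a = c(1, 1, 2)
)

# Write both tables as whitespace-delimited text, without header or row names.
map_path <- tempfile(fileext = ".inp")
hap_path <- tempfile(fileext = ".txt")
write.table(snp_map, map_path, quote = FALSE, row.names = FALSE, col.names = FALSE)
write.table(geno, hap_path, quote = FALSE, row.names = FALSE, col.names = FALSE)

# The read.table() call from the error message works when given a file path ...
map_check <- read.table(map_path, row.names = 1, colClasses = "character")

# ... and fails when given the data frame itself.
err <- tryCatch(
  read.table(snp_map, row.names = 1, colClasses = "character"),
  error = function(e) conditionMessage(e)
)

# Assumption: with files on disk, the original call can be retried using paths.
# hap <- data2haplohh(hap_file = hap_path,
#                     map_file = map_path,
#                     haplotype.in.columns = TRUE,
#                     recode.allele = FALSE,
#                     chr.name = 1)
```

Here map_check confirms the map file can be read back in the layout the error message implies (SNP names as row names, remaining fields as character columns), while err captures the failure from passing a data frame directly. Whether the rest of the import succeeds still depends on rehh's own checks, for example on whether each column is meant to be a phased haplotype rather than a diploid genotype.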