Channel: Active questions tagged r - Stack Overflow

vegdist function cannot handle datasets of abundance containing 0


As marine biologists, we need to determine whether the abundance of four fish species, counted three times over a year, differs among three artificial reefs (A, B, and C) and among months (June, September, November). Each reef was sampled with three replicates (1, 2, 3) on each occasion. The data, with the factors included for clarity, are as follows:

data <- as.data.frame(matrix(NA, 27, 4, dimnames =
               list(1:27, c("Diplodus sargus", "Chelon labrosus", "Oblada melanura", "Seriola dumerii"))))
#fish counts
data$`Diplodus sargus` <- as.numeric(c(0,0,0,0,0,0,0,0,0,5,0,0,3,0,0,0,0,1,0,0,0,0,0,0,4,0,0))
data$`Oblada melanura` <- as.numeric(c(0,0,0,10,0,0,0,0,0,0,0,0,10,5,0,0,0,0,1,0,2,3,0,2,0,0,0))
data$`Chelon labrosus` <- as.numeric(c(0,0,0,0,2,0,6,0,0,0,0,0,3,0,0,2,0,0,0,0,0,3,0,0,0,0,1))
data$`Seriola dumerii` <- as.numeric(c(4,0,2,0,1,1,0,0,9,0,0,0,0,0,3,0,0,7,0,0,0,8,0,0,0,1,0))
#factors: 3 months x 3 reefs x 3 replicates = 27 rows,
#ordered by month, then reef, then replicate (as in `combined`)
data$reef <- rep(rep(c("A", "B", "C"), each = 3), times = 3)
data$month <- rep(c("June", "September", "November"), each = 9)
data$combined <- paste0(rep(c("June", "Sep", "Nov"), each = 9), data$reef)
data$Replicate <- rep(c("1", "2", "3"), times = 9)
#square-root data
comp <- sqrt(data[, 1:4])

library(vegan) 
mydist <- vegdist(comp, method = "bray")
pl.clust <- hclust(mydist, method = "complete")
Error in hclust(mydist, method = "complete") : 
  NA/NaN/Inf in foreign function call (arg 11)
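
The NaN values can be traced with a few lines of base R. The sketch below is a hand-written version of the Bray-Curtis formula, sum(|x_i - x_j|) / sum(x_i + x_j), and not vegan's own code; the helper name `bray_curtis` and the four-sample matrix `toy` are made up for illustration. It suggests that the trouble is not zeros as such but pairs of samples that are both completely empty, for which the formula becomes 0/0.

```r
# Hand-rolled Bray-Curtis dissimilarity (illustrative helper, base R only;
# this is NOT vegan's implementation).
bray_curtis <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # numerator: summed absolute differences; denominator: summed totals
      d[i, j] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
    }
  }
  as.dist(d)  # keeps only the lower triangle
}

# Hypothetical toy counts: s2 and s3 are samples with no fish at all.
toy <- rbind(s1 = c(4, 0, 0, 0),
             s2 = c(0, 0, 0, 0),
             s3 = c(0, 0, 0, 0),
             s4 = c(0, 2, 0, 1))

d <- as.matrix(bray_curtis(sqrt(toy)))
d["s1", "s2"]            # 1: an empty sample vs. a non-empty one is defined
d["s2", "s3"]            # NaN: two empty samples give 0/0
which(rowSums(toy) == 0) # flags the empty samples (s2 and s3)
```

Applied to the data above, `which(rowSums(data[, 1:4]) == 0)` flags seven samples with no fish (rows 2, 8, 11, 12, 17, 20 and 23), so every pair among those rows is a candidate for NaN.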

The aim is to run a permutational ANOVA (PERMANOVA) on Bray-Curtis dissimilarities of the square-root-transformed data, to test whether samples (assemblages of counted species) differ significantly by factor, alone or in combination. However, vegdist does not seem to handle this data set: it returns a dist object containing NaN, which neither hclust (see the error above) nor adonis can work with. The data contain many zeros, including samples in which no fish at all were counted. I thought of simply adding 1 to each count, since it is the differences between samples that matter, not the absolute values. However, mydist <- ecodist::bcdist(squared_data,rmzero=FALSE) gives a very different result from that first solution. Is anybody familiar with this issue, and how should it be handled correctly?
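
To make the candidate workarounds concrete, here is a minimal base-R sketch comparing three of them on a made-up four-sample matrix: dropping empty samples, appending a constant "dummy species" column (sometimes called a zero-adjusted Bray-Curtis), and the "+1 to every count" idea. The helper `bray_curtis`, the matrix `toy`, and the dummy value of 1 are all assumptions for illustration. The helper is a hand-written formula rather than vegan's code, and which option is appropriate depends on whether an empty sample is ecologically meaningful in the study.

```r
# Hand-rolled Bray-Curtis dissimilarity (illustrative helper, base R only;
# this is NOT vegan's implementation).
bray_curtis <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
    }
  }
  as.dist(d)
}

# Hypothetical toy counts: s2 and s3 are samples with no fish at all.
toy <- rbind(s1 = c(4, 0, 0, 0),
             s2 = c(0, 0, 0, 0),
             s3 = c(0, 0, 0, 0),
             s4 = c(0, 2, 0, 1))
toy_sqrt <- sqrt(toy)

# Option 1: drop empty samples. The same rows would also have to be
# dropped from the factor columns before any test on the distances.
keep <- rowSums(toy_sqrt) > 0
d_drop <- bray_curtis(toy_sqrt[keep, , drop = FALSE])

# Option 2: append a constant dummy column (value 1 is an assumption).
# Two empty samples then have dissimilarity 0 instead of 0/0.
d_dummy <- bray_curtis(cbind(toy_sqrt, dummy = 1))

# Option 3: add 1 to every count before transforming, for comparison.
d_plus1 <- bray_curtis(sqrt(toy + 1))

anyNA(d_dummy)  # FALSE
anyNA(d_plus1)  # FALSE
round(as.matrix(d_dummy), 3)
round(as.matrix(d_plus1), 3)
```

Options 2 and 3 both remove the NaN values but are not equivalent: adding 1 to every count changes every species column of every sample, whereas the dummy column only adds a fixed amount to each numerator-free denominator. That difference is one plausible reason for two "fixes" giving very different distance matrices.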

Thank you; I look forward to your replies.

