Channel: Active questions tagged r - Stack Overflow

Leave-one-out cross-validation in R returns very low accuracy (looking for feedback and comments)


I am trying to compute the accuracy of a decision tree on the seeds dataset (link to the seeds dataset) using leave-one-out cross-validation, repeated over 20 iterations. However, the overall accuracy I get is very low (30%-35%). This is what I have done so far:

library(rpart)

# Load the seeds dataset (tab-separated, no header row) and name the columns
seed <- read.csv("seeds_dataset.txt", header = FALSE, sep = "\t")
colnames(seed) <- c("area", "per.", "comp.", "l.kernel", "w.kernel", "asy_coeff", "lenkernel", "type")

sampleSize <- nrow(seed)
# One row per held-out observation, one column per iteration
mat <- matrix(nrow = sampleSize, ncol = 20)
for (t in 1:20) {
  # Shuffle the rows for this iteration
  testSampleIdx <- sample(nrow(seed), size = sampleSize)
  data <- seed[testSampleIdx, ]

  # Leave-one-out: train on all rows except i, test on row i
  for (i in 1:nrow(data)) {
    training <- data[-i, ]
    test <- data[i, ]
    classification <- rpart(type ~ ., data = training, method = "class")
    prediction <- predict(classification, newdata = test, type = "class")
    # Confusion matrix for the single held-out row
    cm <- table(test$type, prediction)
    accuracy <- sum(diag(cm)) / sum(cm)
    mat[i, t] <- accuracy
  }
}
# Report the accuracy of each iteration, then the overall mean
for (i in 1:ncol(mat)) {
  print(paste("accuracy for ", i, " iteration ", round(mean(mat[, i]) * 100, 1), "%", sep = ""))
}
print(paste("overall accuracy ", round(mean(mat) * 100, 1), "%", sep = ""))

Can anyone comment on what might be causing this low accuracy? Any feedback is appreciated. Thank you.
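One thing that may be worth checking is the per-row confusion matrix. With a single held-out row, `table(test$type, prediction)` has one row but one column per class level, so `diag()` only ever reads the first cell. The sketch below reproduces that with toy values and then scores a leave-one-out loop by comparing labels directly. It is only an illustration under assumptions: the toy labels are made up, and the built-in `iris` data stands in for the seeds file, which is not available here.

```r
library(rpart)

# Toy case: one held-out row with true label 2, predicted correctly.
truth      <- 2L                                      # integer label, like the "type" column
prediction <- factor("2", levels = c("1", "2", "3"))  # a class prediction is a factor with all levels

cm <- table(truth, prediction)            # 1 x 3 table, not 3 x 3
diag_accuracy <- sum(diag(cm)) / sum(cm)  # reads only cm[1, 1]
diag_accuracy                             # 0, although the prediction is right

# Comparing the two labels directly scores the same case as correct.
direct_accuracy <- as.numeric(as.character(prediction) == as.character(truth))
direct_accuracy                           # 1

# The same idea in a leave-one-out loop (iris is a stand-in dataset).
correct <- logical(nrow(iris))
for (i in seq_len(nrow(iris))) {
  fit  <- rpart(Species ~ ., data = iris[-i, ], method = "class")
  pred <- predict(fit, newdata = iris[i, , drop = FALSE], type = "class")
  correct[i] <- as.character(pred) == as.character(iris$Species[i])
}
loocv_accuracy <- mean(correct)
loocv_accuracy
```

If this is what happens in the original loop, a row would only count as correct when the tree predicts the first class level, which would put the accuracy near one third for three equally sized classes.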

