Why do my ROC plots and AUC value look good, when my confusion matrix from Random Forests shows that the model is not good at predicting disease?

I'm using the randomForest package in R to build a model that classifies cases as disease (1) or disease free (0):

library(randomForest)

classify_BV_100t <- randomForest(bv.disease ~ ., data = RF_input_BV_clean, ntree = 100, localImp = TRUE)

print(classify_BV_100t)

Call:
 randomForest(formula = bv.disease ~ ., data = RF_input_BV_clean,      ntree = 100, localImp = TRUE) 
           Type of random forest: classification
                 Number of trees: 100
No. of variables tried at each split: 53

    OOB estimate of  error rate: 8.04%
Confusion matrix:
    0  1 class.error
0 510  7  0.01353965
1  39 16  0.70909091

My confusion matrix shows that the model is good at classifying 0 (no disease) but very bad at classifying 1 (disease).
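
Reading the per-class numbers off that OOB confusion matrix (rows are the true class, columns the OOB prediction, with 1 = disease treated as the positive class), the implied sensitivity and specificity are roughly:

    # Counts from the printed OOB confusion matrix above
    TN <- 510; FP <- 7    # true class 0 (disease free)
    FN <- 39;  TP <- 16   # true class 1 (disease)

    TP / (TP + FN)   # sensitivity, 16/55  ~ 0.29 (i.e. 1 - class.error for class 1)
    TN / (TN + FP)   # specificity, 510/517 ~ 0.99 (i.e. 1 - class.error for class 0)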

But when I plot the ROC curve, it gives the impression that the model is pretty good.

Here are the two different ways I plotted it:

  1. (Using https://stats.stackexchange.com/questions/188616/how-can-we-calculate-roc-auc-for-classification-algorithm-such-as-random-forest)

    library(pROC)
    rf.roc<-roc(RF_input_BV_clean$bv.disease, classify_BV_100t$votes[,2])
    plot(rf.roc)
    auc(rf.roc)
    
  2. (Using "How to compute ROC and AUC under ROC after training using caret in R?")

    library(ROCR)
    predictions <- as.vector(classify_BV_100t$votes[,2])
    pred <- prediction(predictions, RF_input_BV_clean$bv.disease)
    
    perf_AUC <- performance(pred,"auc") #Calculate the AUC value
    AUC <- perf_AUC@y.values[[1]]
    
    perf_ROC <- performance(pred,"tpr","fpr") #plot the actual ROC curve
    plot(perf_ROC, main="ROC plot")
    text(0.5,0.5,paste("AUC = ",format(AUC, digits=5, scientific=FALSE)))
    

These are the ROC plots from 1 and 2:

ROC plot 1

ROC plot 2

Both methods give me an AUC of 0.8621593.
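
For what it is worth, pROC guesses which level is the control and which is the case when levels and direction are not supplied. A small sketch that makes this explicit (it assumes bv.disease has factor levels "0" and "1", and should not change the AUC if the automatic guess was already right):

    library(pROC)
    # levels = c(control, case); direction = "<" means controls have the
    # lower predictor values (class-1 vote fraction)
    rf.roc <- roc(RF_input_BV_clean$bv.disease, classify_BV_100t$votes[, 2],
                  levels = c("0", "1"), direction = "<")
    auc(rf.roc)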

Does anyone know why the results from the random forest confusion matrix don't seem to add up with the ROC/AUC?
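
In case it is relevant: the OOB confusion matrix above is based on the default majority-vote cutoff of 0.5 on the class-1 vote fraction, whereas the ROC curve sweeps every possible cutoff. A rough sketch of how the table changes at a lower cutoff (0.2 is an arbitrary value chosen only for illustration):

    # Re-tabulate the OOB predictions at a lower cutoff on the class-1 votes
    cutoff <- 0.2
    pred_class <- factor(ifelse(classify_BV_100t$votes[, 2] > cutoff, "1", "0"),
                         levels = c("0", "1"))
    table(actual = RF_input_BV_clean$bv.disease, predicted = pred_class)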

