
r - Error: `data` and `reference` should be factors with the same levels

I've trained an artificial neural network with caret and nnet in R. I am trying to generate meaningful output, ideally a confusion matrix, but I keep getting errors such as "data and reference should be factors with the same levels" or "arguments must have the same length".

pitchData <- read.csv(file.choose(), header = T)
summary(pitchData)
set.seed(75)
DataSplit <- createDataPartition(cleanPitch$type, p = 0.75, list = FALSE)
trainData = cleanPitch[DataSplit,]
testData = cleanPitch[-DataSplit,]
#ANN for pitcher's case -- physical description variables only
set.seed(2713)
ANNscout <- train(type ~ code + pitch_type + b_score + b_count + s_count + outs + pitch_num + on_1b + on_2b + on_3b,
                  data = trainData, method = "nnet", trace = FALSE)
summary(ANNscout)
predictScout = predict(ANNscout, newData = testData)
confusionMatrix(testData$type, ANNscout)

The error occurs at confusionMatrix(testData$type, ANNscout). I have also tried confusionMatrix(predictScout, testData$type), since summarizing the two objects gives:

> summary(testData$type)
    B     S     X 
65126 82996 31456 
> summary(predictScout)
     B      S      X 
195279 248965  94492 

Both show the same three levels (B, S, X), so I would expect them to be compatible factors.
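A quick arithmetic check on the two summaries may help narrow this down. The sketch below uses only the counts printed above; the variable names are made up for illustration. If the totals differ, the two factors cannot have the same length, whatever their levels are.

```r
# Counts copied from the two summary() outputs in the question
test_counts <- c(B = 65126, S = 82996, X = 31456)
pred_counts <- c(B = 195279, S = 248965, X = 94492)

n_test <- sum(test_counts)   # rows in testData
n_pred <- sum(pred_counts)   # elements in predictScout

# Compare the two lengths directly
n_test == n_pred

# How the two sizes relate to each other and to the p = 0.75 split
n_pred / n_test
n_pred / (n_pred + n_test)
```

The predictions are roughly three times as long as the test set, and they make up about 75% of the combined total. That matches the size of the training partition, which suggests predictScout may hold predictions for trainData rather than testData.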

I have also tried using the table() function as suggested elsewhere, but that does not seem to fix the root issue.
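For comparison, here is a minimal sketch of the same workflow that should produce a confusion matrix. It is not run on the pitch data: the built-in iris dataset stands in for cleanPitch, Species stands in for type, and the names split_idx, train_df, test_df, fit, pred and cm are illustrative. The 3-fold cross-validation setting is only there to keep the example quick. Two things differ from the code above: predict() is called with lowercase newdata, and confusionMatrix() receives the predicted factor rather than the fitted model.

```r
library(caret)  # method = "nnet" also requires the nnet package to be installed

set.seed(75)
# iris stands in for cleanPitch; Species plays the role of type
split_idx <- createDataPartition(iris$Species, p = 0.75, list = FALSE)
train_df <- iris[split_idx, ]
test_df  <- iris[-split_idx, ]

set.seed(2713)
fit <- train(Species ~ ., data = train_df, method = "nnet", trace = FALSE,
             trControl = trainControl(method = "cv", number = 3))

# Lowercase `newdata`: a misspelled `newData` is not matched to this
# argument, so predict() would fall back to the training rows instead
pred <- predict(fit, newdata = test_df)

# Both inputs should be factors of the same length with the same levels
length(pred) == nrow(test_df)
identical(levels(pred), levels(test_df$Species))

# data = predicted classes, reference = observed classes
cm <- confusionMatrix(data = pred, reference = test_df$Species)
cm$table
```

If the two checks before confusionMatrix() return TRUE, neither of the quoted errors should appear; if either is FALSE, that points to which input needs fixing.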

Link to dataset: https://www.kaggle.com/pschale/mlb-pitch-data-20152018#pitches.csv

