Channel: Active questions tagged r - Stack Overflow

Cross validation in R vs scikit-learn for linear regression R2


I have a simple linear regression model:

Y ~ A + B, where Y = Mean_energy and A and B are the predictors.

My dataset consists of only 20 rows.

Because the dataset is so small, I used 5-fold cross validation (cv) to estimate the R2 of the model.

To do cv in Python, I used the cross_validate function from scikit-learn:
cross_validate(model, X, Y, cv=5, scoring='r2')
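
For context, an integer cv passed to scikit-learn for a regressor corresponds to a plain KFold without shuffling, so the folds are consecutive blocks of rows in their original order. The sketch below imitates that splitting logic in plain Python for 20 rows and 5 folds; the helper name contiguous_folds is hypothetical and it is only an illustration, not scikit-learn's actual code.

```python
def contiguous_folds(n_rows, n_folds):
    """Split row indices 0..n_rows-1 into consecutive, unshuffled test folds."""
    base, extra = divmod(n_rows, n_folds)
    folds, start = [], 0
    for i in range(n_folds):
        # The first `extra` folds get one additional row when n_rows
        # is not divisible by n_folds.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


folds = contiguous_folds(20, 5)
for k, test_idx in enumerate(folds, start=1):
    print(f"Fold {k}: test rows {test_idx}")
```

With 20 rows, each fold's R2 is computed on only 4 held-out rows, so per-fold scores can be quite sensitive to which rows land in which fold and to how the rows are ordered.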

To do cv in R, I used train from the caret package:
train.control <- trainControl(method = "cv", number = 5)
model <- train(Y ~ A + B, data = df, method = "lm", trControl = train.control)
I then used model$resample to check the cv R2 for each fold.

The R2 results in R seem to fluctuate much more than those in Python. Any idea why? I suspect that the way I do cv in R is wrong.

cv R2 in R:
Fold 1 = 0.6686680
Fold 2 = 0.3571826
Fold 3 = 0.8858084
Fold 4 = 0.7081766
Fold 5 = 0.3101449

cv R2 in Python:
Fold 1 = 0.29353287
Fold 2 = 0.24257606
Fold 3 = 0.38664367
Fold 4 = 0.26943862
Fold 5 = 0.24531835
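
One thing that may be worth checking when comparing these numbers: as far as I know, the Rsquared column that caret reports by default is the squared correlation between observed and predicted values, whereas scikit-learn's 'r2' scorer is the coefficient of determination, 1 - SS_res / SS_tot. The two can differ a lot on a small held-out fold. Below is a minimal sketch with made-up numbers (not my data) and hypothetical helper names, showing both definitions side by side.

```python
from statistics import mean


def r2_determination(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m = mean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - m) ** 2 for o in obs)
    return 1 - ss_res / ss_tot


def r2_squared_corr(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


# Illustrative values only: predictions are perfectly correlated with the
# observations but shifted by a constant offset of +1.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 3.0, 4.0, 5.0]

print("1 - SS_res/SS_tot:  ", r2_determination(obs, pred))
print("squared correlation:", r2_squared_corr(obs, pred))
```

The squared correlation ignores any constant bias or scaling error in the predictions, so it can never be lower than the coefficient of determination for a linear rescaling of the predictions, and it stays between 0 and 1 even when the other measure is low or negative.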

FYI, for the cross validation in R I followed https://quantdev.ssri.psu.edu/tutorials/cross-validation-tutorial

