I have a tibble that contains all modeling data from cross-validation (CV) for multiple models. The tibble has 5 rows (one per CV split) and columns that hold, for each split separately, the fitted models, predictions, model performance...
I would like to extract the performance column (perf). Obtaining the results goes like this:
cv_splits$perf$`1`$ranger_model
Here 1 denotes the split number and ranger_model the name of the model. This gives me the output:
$metrics
# A tibble: 1 x 3
  .metric .estimator .estimate
  <chr>   <chr>          <dbl>
1 roc_auc binary         0.654

$conf_mat
          Truth
Prediction ok sick
      ok   51    7
      sick  1    0

$roc_auc
# A tibble: 61 x 3
   .threshold specificity sensitivity
        <dbl>       <dbl>       <dbl>
 1   -Inf           0           1
 2      0.446       0           1
 3      0.558       0           0.981
 4      0.558       0.143       0.981
 5      0.560       0.143       0.962
 6      0.563       0.143       0.942
 7      0.566       0.143       0.923
 8      0.575       0.286       0.923
 9      0.598       0.286       0.904
10      0.608       0.286       0.885
# … with 51 more rows
How can I transform and save these performance results (preferably as a tibble) so that I can work with them quickly, for example plotting ROC curves, plotting AUC, ...?
I was thinking about a long-format tibble as the result, with the columns:
split, model, metric (one of $metrics, $conf_mat, $roc_auc), value (the corresponding result, e.g. a tibble)
I'm not sure how to obtain that. I'm also open to suggestions for a better format for the final results.
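For illustration, here is a minimal sketch of the long format described above. It assumes that cv_splits$perf is a list named by split, where each element is a list named by model that holds metrics, conf_mat and roc_auc. The perf object and the make_perf() helper below are hypothetical stand-ins built from toy values, and the confusion matrix is a plain matrix rather than whatever class the real object has. This is one possible approach using purrr, dplyr and tidyr, not necessarily the best one.

```r
library(tibble)
library(purrr)
library(dplyr)
library(tidyr)

# Hypothetical helper: builds one fake performance entry (toy values only).
make_perf <- function(auc) {
  list(
    metrics  = tibble(.metric = "roc_auc", .estimator = "binary", .estimate = auc),
    conf_mat = matrix(c(51, 1, 7, 0), nrow = 2,
                      dimnames = list(Prediction = c("ok", "sick"),
                                      Truth = c("ok", "sick"))),
    roc_auc  = tibble(.threshold  = c(-Inf, 0.5, Inf),
                      specificity = c(0, 0.5, 1),
                      sensitivity = c(1, 0.8, 0))
  )
}

# Hypothetical stand-in for cv_splits$perf: split -> model -> results.
perf <- list(
  `1` = list(ranger_model = make_perf(0.654)),
  `2` = list(ranger_model = make_perf(0.700))
)

# One row per split x model x metric; `value` is a list-column
# holding the tibble or matrix for that metric.
perf_long <- bind_rows(imap(perf, function(models, split) {
  bind_rows(imap(models, function(results, model) {
    tibble(split  = split,
           model  = model,
           metric = names(results),
           value  = unname(results))
  }))
}))

# ROC curve points for every split and model, ready for plotting.
roc_curves <- perf_long %>%
  filter(metric == "roc_auc") %>%
  select(split, model, value) %>%
  unnest(value)

# AUC estimates, one row per split and model.
auc_values <- perf_long %>%
  filter(metric == "metrics") %>%
  select(split, model, value) %>%
  unnest(value)
```

Keeping value as a list-column lets the differently shaped results (tibbles and a confusion matrix) sit in one tibble, and filtering on metric before unnest() gives a flat table for each kind of result.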