Channel: Active questions tagged r - Stack Overflow

How do I fix this error in R: Error in `[.data.frame`(newdata, , object$method$center, drop = FALSE) : undefined columns selected


I am trying to recreate a random forest model from a paper, and the code doesn't seem to work. I am only just learning R and this is very much over my head, but I will try to explain as best I can.

The source code from the paper can be found here: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0181347.s002&type=supplementary

The paper supplies two datasets, training and test, and the code then creates two subsets of each (see the bottom of this post for the head() of the data). The data can be found in the paper's supplementary material here

(you should be able to copy it directly into a .csv). The code is below:

sink("test.txt", split=TRUE)
print("#data process")
data_bin_train<-read.csv("training.csv", head=TRUE)
names(data_bin_train)
data_bin_test<-read.csv("test.csv", head=TRUE)
names(data_bin_test)
dspt_bin_train<-subset(data_bin_train,select=c(-Deamidation))
dspt_bin_test<-subset(data_bin_test,select=c(-Deamidation))
class_bin_train<-subset(data_bin_train, select=c(Deamidation))
class_bin_test<-subset(data_bin_test, select=c(Deamidation))

library("caret")
library("ROCR")
library("pROC")
fitControl <- trainControl(method = "CV",number = 10,returnResamp = "all", verboseIter = FALSE, classProbs = TRUE)
set.seed(2)

This part works fine. The next block of code is where I get the error:

library("randomForest")
print("#Random Forest binary class via caret (randomForest)")
caret_rf_bin_randomf_cv10 <- train(Deamidation~., data=data_bin_train, method = "rf", preProcess = c("center", "scale"), tuneLength = 10, trControl = fitControl)
caret_rf_bin_randomf_cv10
varImp(caret_rf_bin_randomf_cv10)

rf_bin_Preds <- extractPrediction(list(caret_rf_bin_randomf_cv10),testX=dspt_bin_test[,1:13], testY=class_bin_test[,1]) 

Error in `[.data.frame`(newdata, , object$method$center, drop = FALSE) : undefined columns selected

Any help would be amazing! The paper used R v3.1.1 and caret_6.0-35, whereas I am running newer versions of both (see sessionInfo() below). I believe that is where the error is coming from, but I'm not sure how to fix it or, to be honest, what the error even means.
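For anyone reading the message itself: `[.data.frame` raises "undefined columns selected" whenever columns are requested by name and at least one of those names is missing from the data frame. One possible cause here, stated as an assumption and not a confirmed diagnosis, is that the formula interface in train() expands character/factor predictors into dummy columns, so the names stored in the centering step no longer match the raw columns passed as testX. The base-R sketch below reproduces that mismatch on made-up toy data (test_df and its values are hypothetical, and no caret call is involved):

```r
# Toy stand-in for the real test set (hypothetical values, not the paper's data)
test_df <- data.frame(
  AA = c("GLY", "SER", "CYS"),
  attack_distance = c(3.84, 4.81, 4.11),
  stringsAsFactors = TRUE
)

# A formula expands the factor into dummy columns; drop the intercept column
expanded <- model.matrix(~ ., data = test_df)[, -1, drop = FALSE]
expected_cols <- colnames(expanded)

# Names the expanded design has but the raw data frame lacks
missing_cols <- setdiff(expected_cols, names(test_df))
print(missing_cols)

# Selecting the expanded names from the raw data frame triggers the same error
result <- tryCatch(
  test_df[, expected_cols, drop = FALSE],
  error = function(e) conditionMessage(e)
)
print(result)
```

If the same kind of setdiff() comparison between the column names stored in the fitted model's preprocessing step and names(dspt_bin_test) turns up missing names, that would support this explanation; if it comes back empty, the cause lies elsewhere.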

Thank you

TinoMass

Below are the sessionInfo() output and the head() of the two datasets:

R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] randomForest_4.6-14 pROC_1.15.3         ROCR_1.0-7          gplots_3.0.1.1      caret_6.0-84       
[6] ggplot2_3.2.1       lattice_0.20-38    

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.3         pillar_1.4.3       compiler_3.5.3     gower_0.2.1        plyr_1.8.5         bitops_1.0-6      
 [7] iterators_1.0.12   class_7.3-15       tools_3.5.3        rpart_4.1-13       ipred_0.9-9        lubridate_1.7.4   
[13] lifecycle_0.1.0    tibble_2.1.3       nlme_3.1-137       gtable_0.3.0       pkgconfig_2.0.3    rlang_0.4.2       
[19] Matrix_1.2-15      foreach_1.4.7      rstudioapi_0.10    prodlim_2019.11.13 e1071_1.7-3        withr_2.1.2       
[25] stringr_1.4.0      dplyr_0.8.3        caTools_1.17.1.3   gtools_3.8.1       generics_0.0.2     recipes_0.1.8     
[31] stats4_3.5.3       grid_3.5.3         nnet_7.3-12        tidyselect_0.2.5   data.table_1.12.8  glue_1.3.1        
[37] R6_2.4.1           survival_2.43-3    gdata_2.18.0       lava_1.6.6         reshape2_1.4.3     purrr_0.3.3       
[43] magrittr_1.5       ModelMetrics_1.2.2 scales_1.1.0       codetools_0.2-16   MASS_7.3-51.1      splines_3.5.3     
[49] assertthat_0.2.1   timeDate_3043.102  colorspace_1.4-1   KernSmooth_2.23-15 stringi_1.4.3      lazyeval_0.2.2    
[55] munsell_0.5.0      crayon_1.3.4     

training.txt

PDB   `Residue #` `AA following A… attack_distance Half_life norm_B_factor_C norm_B_factor_CA norm_B_factor_CB norm_B_factor_CG
  <chr>       <dbl> <chr>                      <dbl>     <dbl>           <dbl>            <dbl>            <dbl>            <dbl>
1 11BG           67 GLY                         3.84      1.02           1.46             1.46             1.36             1.38 
2 11BG           17 SER                         4.81     11.8            0.692            0.706            1.18             1.62 
3 11BG           71 CYS                         4.11     55.5            0.174            0.481            0.574            0.782
4 11BG           44 THR                         3.33     49.9           -1.24            -1.30            -1.35            -1.52 
5 11BG           94 CYS                         4.97     60              1.41             1.64             1.92             2.15 
6 11BG           27 LEU                         4.52    119             -0.898           -0.905           -0.820           -0.604

test.txt

PDB   `Residue #` `AA following A… attack_distance Half_life norm_B_factor_C norm_B_factor_CA norm_B_factor_CB norm_B_factor_CG
 <chr>       <dbl> <chr>                      <dbl>     <dbl>           <dbl>            <dbl>            <dbl>            <dbl>
1 1ACC          713 GLY                         3.69      1.45           3.17             3.35              3.63             4.06
2 1ACC          719 GLY                         4.64      1.04           0.688            0.865             1.42             1.83
3 1ACC           28 PHE                         4.81     72.4            1.03             1.06              1.58             1.95
4 1ACC           52 ILE                         4.73    279              0.944            1.13              1.29             1.46
5 1ACC           85 HIS                         3.60      9.7            0.780            0.800             1.16             1.57
6 1ACC          104 LYS                         4.51     55.5            2.22             2.47              2.69             2.91
