Channel: Active questions tagged r - Stack Overflow

Using mRMRe for feature selection: my categorical target variable is sometimes selected

I have a data frame "data" with 60 rows (samples) and 20228 columns. The first column is my target variable (an ordered factor with levels 0 and 1) and the remaining columns are my numeric features. I want to run feature selection with mRMRe inside a loop that performs 5-fold cross-validation, repeated 3 times, selecting 25 features each time. Here is the problematic part of my code:

library(caret)
library(mRMRe)

data <- read.csv("home/RNA_seq.csv", row.names=1, sep=";", stringsAsFactors=FALSE)
data <- data.frame(t(data))

features_select <- list()

r <- 5 # number of folds (5-fold cross-validation)
t <- 3 # number of repetitions of the 5-fold cross-validation
  for (j in 1:t){
    # Create the folds once per repetition so that the r test sets are disjoint
    train.index <- createFolds(factor(data$Response), k = r, list = TRUE, returnTrain = TRUE)
    for (i in 1:r){
      datatrain <- data[train.index[[i]],]
      datatest  <- data[-train.index[[i]],]

      datatrain[,1] <- factor(datatrain[,1])
      datatrain[,1] <- ordered(datatrain[,1], levels = c("0", "1"))
      datatest[,1] <- factor(datatest[,1])

      #Feature selection
      data.mrmre.train <- mRMR.data(data=datatrain)
      res.fs.mrmr <- mRMR.classic(data=data.mrmre.train, target_indices=1, feature_count=25)
      selected.features.mrmre <- mRMRe::solutions(res.fs.mrmr)
      features_select[[((j-1)*r+i)]] <- res.fs.mrmr@feature_names[unlist(selected.features.mrmre)]
    }
  }
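
Since createFolds draws the folds at random, the loop above produces different splits on every run unless a seed is set. Below is a minimal base-R sketch of why fixing the seed makes a run repeatable. The `make_folds` helper, the toy `response` vector and the seed value are hypothetical stand-ins so that the snippet runs without caret; they are not part of my actual code.

```r
set.seed(123)  # arbitrary seed, chosen only for illustration
response <- rep(c(0, 1), each = 30)  # hypothetical 60-sample binary target

# Hypothetical helper: assign each sample to one of k folds, stratified by class
make_folds <- function(y, k) {
  fold <- integer(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    # Shuffle a balanced vector of fold labels within this class
    fold[idx] <- sample(rep_len(seq_len(k), length(idx)))
  }
  fold
}

folds_a <- make_folds(response, k = 5)

set.seed(123)  # resetting the same seed reproduces the same assignment
folds_b <- make_folds(response, k = 5)

print(identical(folds_a, folds_b))
```

In the real loop, the equivalent would presumably be a single set.seed() call before the outer for loop, so that a run in which "Response" is selected can be reproduced exactly.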

My problem is that my target variable, "Response" (column 1 of "data"), is sometimes selected by mRMRe. For example:

features_select:

[[1]]
 [1] "RNA5SP84"   "MRPS10P1"   "IGKV1.13"   "RNA5SP296"  "AC079354.1"
 [6] "AL021997.1" "TRAJ34"     "AC009997.1" "AC090844.1" "RPS29P11"
[11] "AC092810.1" "RNA5SP370"  "FAM25E"     "RNA5SP33"   "AP000873.1"
[16] "RNA5SP379"  "HSBP1P1"    "SST"        "TMSB10P1"   "RNA5SP335"
[21] "AC099789.1" "RNA5SP327"  "RNA5SP123"  "RNA5SP180"  "TRGJ2"

[[2]]
 [1] "CT47A2"   "Response" "Response" "Response" "Response" "Response"
 [7] "Response" "Response" "Response" "Response" "Response" "Response"
[13] "Response" "Response" "Response" "Response" "Response" "Response"
[19] "Response" "Response" "Response" "Response" "Response" "Response"
[25] "Response"

This does not happen for the same values of i and j on every run of the loop. Do you have any idea where the problem is?
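
To pin down which iterations are affected, something like the following base-R sketch could be used. The `features_select` list here is a hypothetical stand-in with the same shape as mine (15 slots = 3 repetitions x 5 folds), and `n_rep` is just an illustrative name for the number of repetitions.

```r
# Hypothetical stand-in for the features_select list built in the loop;
# only slot 2 contains the target name in this toy example.
r <- 5
n_rep <- 3
features_select <- replicate(r * n_rep, c("GENE_A", "GENE_B"), simplify = FALSE)
features_select[[2]] <- c("CT47A2", rep("Response", 24))

# Slots in which the target name shows up among the selected features
bad_slots <- which(vapply(features_select,
                          function(x) "Response" %in% x,
                          logical(1)))

# Invert slot = (j - 1) * r + i to recover repetition j and fold i
bad_runs <- data.frame(slot = bad_slots,
                       j = (bad_slots - 1) %/% r + 1,
                       i = (bad_slots - 1) %% r + 1)
print(bad_runs)
```

The training fold for each reported (j, i) pair can then be inspected on its own.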

Thank you in advance!

