I have a data frame "data" with 60 rows (samples) and 20228 columns, where the first column is my target variable (an ordered factor: 0 or 1) and the remaining columns are my numeric features. I want to run feature selection with mRMRe inside a loop that performs 5-fold cross-validation repeated 3 times, selecting 25 features each time. Here is the problematic part of my code:
library(caret)
library(mRMRe)
data <- read.csv("home/RNA_seq.csv", row.names=1, sep=";", stringsAsFactors=FALSE)
data <- data.frame(t(data))
features_select <- list()
r <- 5 # number of folds (5-fold cross-validation)
t <- 3 # number of repetitions of the 5-fold cross-validation
for (j in 1:t){
  # Draw one 5-fold partition per repetition, so the r test sets of a repetition do not overlap
  train.index <- createFolds(factor(data$Response), k = r, list = TRUE, returnTrain = TRUE)
  for (i in 1:r){
    # Train/test split for fold i
    datatrain <- data[train.index[[i]],]
    datatest <- data[-train.index[[i]],]
    # The target (column 1) is passed to mRMRe as an ordered factor
    datatrain[,1] <- factor(datatrain[,1])
    datatrain[,1] <- ordered(datatrain[,1], levels = c("0", "1"))
    datatest[,1] <- factor(datatest[,1])
    # Feature selection
    data.mrmre.train <- mRMR.data(data=datatrain)
    res.fs.mrmr <- mRMR.classic(data=data.mrmre.train, target_indices=1, feature_count=25)
    selected.features.mrmre <- mRMRe::solutions(res.fs.mrmr)
    # Store the selected feature names at position (j-1)*r+i
    features_select[[((j-1)*r+i)]] <- res.fs.mrmr@feature_names[unlist(selected.features.mrmre)]
  }
}
My problem is that my target variable, "Response" (column 1 of "data"), is sometimes selected by mRMRe as a feature. For example:
features_select
[[1]]
[1] "RNA5SP84" "MRPS10P1" "IGKV1.13" "RNA5SP296" "AC079354.1"
[6] "AL021997.1" "TRAJ34" "AC009997.1" "AC090844.1" "RPS29P11"
[11] "AC092810.1" "RNA5SP370" "FAM25E" "RNA5SP33" "AP000873.1"
[16] "RNA5SP379" "HSBP1P1" "SST" "TMSB10P1" "RNA5SP335"
[21] "AC099789.1" "RNA5SP327" "RNA5SP123" "RNA5SP180" "TRGJ2"
[[2]]
[1] "CT47A2" "Response" "Response" "Response" "Response" "Response"
[7] "Response" "Response" "Response" "Response" "Response" "Response"
[13] "Response" "Response" "Response" "Response" "Response" "Response"
[19] "Response" "Response" "Response" "Response" "Response" "Response"
[25] "Response"
This does not happen at the same values of i and j each time I run the loop. Do you have any idea where the problem is?
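To make it easier to see which iterations are affected, here is a minimal sketch that lists every (j, i) pair whose selection contains "Response". It only inverts the index (j-1)*r+i used in the loop above. The toy list stands in for the real features_select so that the snippet runs on its own, and the names bad and affected are made up for illustration.

```r
# Toy stand-in for the list filled by the loop (hypothetical values).
r <- 5
features_select <- vector("list", 15)
for (k in seq_along(features_select)) features_select[[k]] <- c("SST", "TRGJ2")
features_select[[2]] <- c("CT47A2", "Response")
features_select[[8]] <- c("Response", "SST")

# Positions in the list whose selected features include the target column.
bad <- which(vapply(features_select, function(f) "Response" %in% f, logical(1)))

# Map each flat position back to (repetition j, fold i), inverting (j-1)*r + i.
affected <- data.frame(
  j = (bad - 1) %/% r + 1,
  i = (bad - 1) %% r + 1
)
print(affected)
```

With the real list, only the first block (the toy data) needs to be dropped; calling set.seed() before the loop should also make the affected pairs reproducible between runs, since createFolds draws the folds at random.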
Thank you in advance!