I cannot figure out why my random forest grid search hangs. I have tried many things suggested on Stack Overflow, but nothing works. First of all, here is my code:
library(data.table)
library(h2o)
library(dplyr)
# Initialise H2O
localH2O = h2o.init(nthreads = -1, min_mem_size = "9240M", max_mem_size = "11336M")
h2o.removeAll()
# Specify some dirs, inputs etc. (not shown)
laufnummer <- 10
set.seed(laufnummer)
maxmodels <- 500
# Convert to h2o
h2o_input <- as.h2o(input)
# Split: 80% = train; 0 = valid; rest = 20% = test
splits <- h2o.splitFrame(h2o_input, c(0.80,0))
train <- h2o.assign(splits[[1]], "train") # 80%
test <- h2o.assign(splits[[3]], "test") # 20%
Set parameters:
# Select range of ntrees
min_ntrees <- 10
max_ntrees <- 2500
stepsize_ntrees <- 20
ntrees_opts <- seq(min_ntrees,max_ntrees, stepsize_ntrees)
# Select range of mtries
min_mtries <- 1
max_mtries <- 12
stepsize_mtries <- 1
mtries_opts <- seq(min_mtries,max_mtries, stepsize_mtries)
# Cross-validation number of folds
nfolds <- 5
hyper_params_dl = list(ntrees = ntrees_opts,
                       mtries = mtries_opts)
search_criteria_dl = list(
  strategy = "RandomDiscrete",
  max_models = maxmodels)
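For reference, here is a rough count of the work this setup implies. It is only a sketch in base R using the values above, and it assumes that H2O builds one model per fold plus a final model on the full training frame when nfolds is set, which is how its cross-validation is documented. The names n_combinations, n_grid_models and n_forests are made up for this illustration.

```r
# Values copied from the setup above
ntrees_opts <- seq(10, 2500, 20)   # 10, 30, ..., 2490
mtries_opts <- seq(1, 12, 1)       # 1, 2, ..., 12
nfolds      <- 5
maxmodels   <- 500

# Size of the full Cartesian grid of (ntrees, mtries) pairs
n_combinations <- length(ntrees_opts) * length(mtries_opts)

# RandomDiscrete samples combinations until max_models is reached
# (or the grid is exhausted, whichever comes first)
n_grid_models <- min(maxmodels, n_combinations)

# Assumption: each grid model = nfolds cross-validation models + 1 final model
n_forests <- n_grid_models * (nfolds + 1)

cat("hyperparameter combinations:", n_combinations, "\n")
cat("grid models to train:", n_grid_models, "\n")
cat("forests built including CV:", n_forests, "\n")
```

So the search samples 500 of 1500 possible combinations and, under the assumption above, builds about 3000 forests, some with more than 2000 trees each.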
Finally, the random grid search (this is where it hangs, almost always at 25%):
rf_grid <- h2o.grid(seed = laufnummer,
                    algorithm = "randomForest",
                    grid_id = "dlgrid",
                    x = predictors,
                    y = response,
                    training_frame = train,
                    nfolds = nfolds,
                    keep_cross_validation_predictions = TRUE,
                    model_id = "rf_grid",
                    hyper_params = hyper_params_dl,
                    search_criteria = search_criteria_dl
)
Here is what I already tried:
- Did not set nthreads in init: no effect.
- Set nthreads to 4: no effect.
- Set lower memory (I have 16 GB): no effect.
- Added parallelism = 0 in grid search: no effect.
- Did not use h2o.removeAll(): no effect.
- Always used h2o.shutdown(prompt = FALSE) at end: no effect.
- Used different versions of JDK, R and h2o (now using the latest of each).
The problem is that the grid search progress stops at around 25%, sometimes less.
Switching the code from RF to GBM helps, although GBM sometimes hangs as well (and I need RF!). Reducing the number of models from 5000 to 500 also helped, but only for NN and GBM, not for RF.
After trying for some weeks now, I would appreciate any help very much!! Thank you!