In h2o.gbm() one can use stopping_rounds to cease training when stopping_metric stops improving; however, I'd like training to stop as soon as a certain threshold is reached, regardless of any improvement.
For example (setup taken from https://github.com/h2oai/h2o-3/blob/master/h2o-docs/src/product/tutorials/gbm/gbmTuning.Rmd):
library(h2o)
h2o.init(nthreads = 1)
df <- h2o.importFile(path = "http://s3.amazonaws.com/h2o-public-test-data/smalldata/gbm_test/titanic.csv")
## pick a response for the supervised problem
response <- "survived"
## the response variable is an integer, we will turn it into a categorical/factor for binary classification
df[[response]] <- as.factor(df[[response]])
## use all other columns (except for the name) as predictors
predictors <- setdiff(names(df), c(response, "name"))
splits <- h2o.splitFrame(
  data = df,
  ratios = c(0.6, 0.2), ## only need to specify 2 fractions, the 3rd is implied
  destination_frames = c("train.hex", "valid.hex", "test.hex"), seed = 1234
)
train <- splits[[1]]
valid <- splits[[2]]
test <- splits[[3]]
Suppose I only need an AUC of 90% (or, alternatively, I already have a model that has reached 90%). How can I use the stopping_ arguments to cease training once the AUC on the validation set exceeds 90%? It seems I can only stop when the metric fails to improve, not when it crosses a fixed value.
gbm_a <- h2o.gbm(x = predictors, y = response, training_frame = train)
h2o.auc(h2o.performance(gbm_a, newdata = valid))
#> 0.9480135
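To make the desired behaviour concrete, here is a rough sketch of the kind of workaround I have in mind, not something I know to be the recommended approach. It grows the model a few trees at a time and stops once the validation AUC reaches the target. train_until_auc, fit_fn and auc_fn are hypothetical names of my own, not part of h2o. The wiring assumes that h2o.gbm() can resume from an earlier model via its checkpoint argument, and that h2o.auc(model, valid = TRUE) returns the validation AUC. Because scoring only happens every `step` trees, training stops within `step` trees of the threshold rather than exactly at it.

```r
# Hypothetical helper (not part of h2o): grow a model in batches of `step`
# trees and stop as soon as the validation AUC reaches `target_auc`.
# The fitting and scoring functions are passed in, so the loop itself
# does not depend on h2o.
train_until_auc <- function(fit_fn, auc_fn, target_auc = 0.90,
                            step = 5, max_trees = 500) {
  model <- NULL
  ntrees <- 0
  auc <- NA_real_
  while (ntrees < max_trees) {
    ntrees <- min(ntrees + step, max_trees)
    model <- fit_fn(ntrees, model)  # previous model is passed so it can be resumed
    auc <- auc_fn(model)
    if (auc >= target_auc) break    # threshold reached: stop adding trees
  }
  list(model = model, ntrees = ntrees, auc = auc)
}

# Assumed wiring for the titanic example above (needs a running h2o cluster).
# On the first call there is no previous model, so checkpoint stays NULL.
fit_fn <- function(ntrees, prev) {
  h2o.gbm(x = predictors, y = response,
          training_frame = train, validation_frame = valid,
          ntrees = ntrees, seed = 1234,
          checkpoint = if (is.null(prev)) NULL else prev@model_id)
}
auc_fn <- function(model) h2o.auc(model, valid = TRUE)

# Usage (commented out because it requires the frames defined earlier):
# res <- train_until_auc(fit_fn, auc_fn, target_auc = 0.90)
# res$ntrees; res$auc
```

This works, I think, but it feels clumsy compared with a built-in stopping_ option, which is what I am really asking about.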