Channel: Active questions tagged r - Stack Overflow

h2o.gbm(): stop training when a specific AUC (or logloss, etc.) is reached, rather than when the metric stops improving


In h2o.gbm() one can use stopping_rounds to cease training once stopping_metric stops improving; however, I'd like training to stop when the metric reaches a certain threshold, regardless of how much it is still improving.

For example (taken from https://github.com/h2oai/h2o-3/blob/master/h2o-docs/src/product/tutorials/gbm/gbmTuning.Rmd):

library(h2o)
h2o.init(nthreads = 1)
df <- h2o.importFile(path = "http://s3.amazonaws.com/h2o-public-test-data/smalldata/gbm_test/titanic.csv")
## pick a response for the supervised problem
response <- "survived"
## the response variable is an integer, we will turn it into a categorical/factor for binary classification
df[[response]] <- as.factor(df[[response]])           
## use all other columns (except for the name) as predictors
predictors <- setdiff(names(df), c(response, "name")) 

splits <- h2o.splitFrame(
  data = df, 
  ratios = c(0.6,0.2),   ## only need to specify 2 fractions, the 3rd is implied
  destination_frames = c("train.hex", "valid.hex", "test.hex"), seed = 1234
)
train <- splits[[1]]
valid <- splits[[2]]
test  <- splits[[3]]

Suppose I only need an AUC of 90% (or, alternatively, I already have a model that has reached 90%). How can I use the stopping_ arguments to cease training once the AUC on the validation set exceeds 90%? It seems I can only stop when the metric fails to improve.

gbm_a <- h2o.gbm(x = predictors, y = response, training_frame = train)

h2o.auc(h2o.performance(gbm_a, newdata = valid)) 
#> 0.9480135
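
One possible workaround, sketched here under assumptions rather than taken from the H2O docs as a recommended recipe: grow the model a few trees at a time with the checkpoint argument of h2o.gbm() and check the validation AUC after each round. The names target_auc, step, max_trees, and gbm_t are hypothetical, and the setup is repeated in compact form only so the snippet runs on its own.

```r
library(h2o)
h2o.init(nthreads = 1)

## same data preparation as in the question, condensed
df <- h2o.importFile(path = "http://s3.amazonaws.com/h2o-public-test-data/smalldata/gbm_test/titanic.csv")
response <- "survived"
df[[response]] <- as.factor(df[[response]])
predictors <- setdiff(names(df), c(response, "name"))
splits <- h2o.splitFrame(data = df, ratios = c(0.6, 0.2), seed = 1234)
train <- splits[[1]]
valid <- splits[[2]]

target_auc <- 0.90   ## hypothetical threshold on validation AUC
step       <- 5      ## hypothetical number of trees added per round
max_trees  <- 200    ## hypothetical cap so the loop always terminates

## first round: a small model scored on the validation frame
ntrees <- step
gbm_t <- h2o.gbm(x = predictors, y = response,
                 training_frame = train, validation_frame = valid,
                 ntrees = ntrees, seed = 1234)

## keep adding trees until the threshold is met or the cap is hit
while (h2o.auc(gbm_t, valid = TRUE) < target_auc && ntrees < max_trees) {
  ntrees <- ntrees + step
  ## checkpoint resumes from the previous model; ntrees is the new total,
  ## and the other model parameters are kept identical between rounds
  gbm_t <- h2o.gbm(x = predictors, y = response,
                   training_frame = train, validation_frame = valid,
                   ntrees = ntrees, seed = 1234,
                   checkpoint = gbm_t@model_id)
}

final_auc <- h2o.auc(gbm_t, valid = TRUE)
final_auc
ntrees
```

The threshold is only checked every step trees, so the final model may overshoot the target slightly; a smaller step gives a tighter stop at the cost of more h2o.gbm() calls.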
