The Importance of Knowing When to Stop

Journal: Methods of Information in Medicine
Subtitle: A journal stressing, for more than 50 years, the methodology and scientific fundamentals of organizing, representing and analyzing data, information and knowledge in biomedicine and health care
ISSN: 0026-1270
Focus Theme: Recent Developments in Boosting Methodology
Guest Editors: M. Schmid, T. Hothorn

DOI: https://doi.org/10.3414/ME11-02-0030
Issue: 2012 (Vol. 51), Issue 2
Pages: 178-186

A Sequential Stopping Rule for Component-wise Gradient Boosting

A. Mayr (1), B. Hofner (1), M. Schmid (1)

(1) Institut für Medizininformatik, Biometrie und Epidemiologie, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany

Keywords

Variable selection, gradient boosting, resampling methods, early stopping, penalized regression

Summary

Objectives: Component-wise boosting algorithms have evolved into a popular estimation scheme in biomedical regression settings. The number of boosting iterations is the most important tuning parameter for optimizing their performance. To date, no fully automated strategy for determining the optimal stopping iteration of boosting algorithms has been proposed.

Methods: We propose a fully data-driven sequential stopping rule for boosting algorithms. It combines resampling methods with a modified version of an earlier stopping approach that depends on AIC-based information criteria. The new “subsampling after AIC” stopping rule is applied to component-wise gradient boosting algorithms.
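
To illustrate the role of the stopping iteration, the following is a minimal sketch of component-wise gradient boosting with squared-error loss, where the stopping iteration is chosen by plain subsampling. It is not the authors' sequential "subsampling after AIC" rule, whose details are not given in this summary. The function names, the step length `nu`, the 50% subsample fraction and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def componentwise_l2_boost(X, y, n_iter, nu=0.1):
    """Component-wise L2 boosting with one linear base-learner per column.

    Returns the intercept and the coefficient path (one row per iteration).
    """
    n, p = X.shape
    intercept = y.mean()
    coef = np.zeros(p)
    path = np.zeros((n_iter, p))
    col_ss = (X ** 2).sum(axis=0)        # per-column sum of squares
    fit = np.full(n, intercept)
    for m in range(n_iter):
        resid = y - fit                   # negative gradient of squared-error loss
        beta = X.T @ resid / col_ss       # least-squares fit of each single column
        rss_gain = beta ** 2 * col_ss     # reduction in RSS achieved by each column
        j = int(np.argmax(rss_gain))      # best-fitting component in this iteration
        coef[j] += nu * beta[j]           # update only that component
        fit += nu * beta[j] * X[:, j]
        path[m] = coef
    return intercept, path


def select_mstop_by_subsampling(X, y, max_iter=200, n_splits=10,
                                frac=0.5, nu=0.1, seed=0):
    """Pick the iteration with the lowest mean out-of-sample squared error."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_train = int(frac * n)
    risk = np.zeros(max_iter)
    for _ in range(n_splits):
        idx = rng.permutation(n)
        train, test = idx[:n_train], idx[n_train:]
        b0, path = componentwise_l2_boost(X[train], y[train], max_iter, nu)
        pred = b0 + X[test] @ path.T      # shape (n_test, max_iter)
        risk += ((y[test][:, None] - pred) ** 2).mean(axis=0)
    risk /= n_splits
    return int(np.argmin(risk)) + 1, risk


# Simulated example (hypothetical data): 2 informative out of 20 covariates.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 20))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.standard_normal(100)

mstop, risk = select_mstop_by_subsampling(X, y)
intercept, path = componentwise_l2_boost(X, y, n_iter=200)
print("selected stopping iteration:", mstop)
print("covariates selected at mstop:", np.flatnonzero(path[mstop - 1]))
```

Stopping early keeps the coefficients of components that were never selected at exactly zero, which is why the choice of stopping iteration also acts as variable selection in this sketch.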

Results: The newly developed sequential stopping rule outperformed earlier approaches when applied to both simulated and real data. Specifically, it improved on purely AIC-based methods when used for the microarray-based prediction of the recurrence of metastases in stage II colon cancer patients.

Conclusions: The proposed sequential stopping rule for boosting algorithms can help to identify the optimal stopping iteration while the algorithm is still being fitted, at least for the most common loss functions.
