Comparing split-sample averaged (SSA), cross-validation (CV) and bootstrapping (BS) confidence interval estimation: a simulation study

For predictive algorithms, assessing model performance is critical. Common performance measures include the area under the curve (AUC), the average positive predictive value (AP), the Brier score, and the scaled Brier score. However, most studies report only point estimates of these metrics. As reporting of confidence intervals is increasingly required in medical studies, it is important to be able to construct confidence intervals for the performance estimates. Three popular approaches for estimating these intervals are split-sample averaged (SSA), cross-validation (CV), and bootstrapping (BS). This simulation study compares the three methods, evaluating the confidence intervals by their coverage probability and width, and the point estimates by their bias. Details can be found here
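To make the evaluation criteria concrete, here is a minimal Python sketch of how coverage probability, interval width, and bias could be computed for one of the three approaches (a percentile bootstrap interval for the AUC). It is not the code used in the study. The data-generating model (normal scores with a mean shift `mu` for positives), the function names, and the default settings are all assumptions made for illustration, and for simplicity the bootstrap resamples fixed scores rather than refitting a model on each resample.

```python
import random
from statistics import NormalDist


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=200, alpha=0.05, rng=None):
    """Percentile bootstrap interval for the AUC (scores held fixed, no refitting)."""
    rng = rng or random.Random(0)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def summarize(results, truth):
    """results: list of (estimate, lower, upper). Returns (coverage, mean width, bias)."""
    k = len(results)
    coverage = sum(lo <= truth <= hi for _, lo, hi in results) / k
    width = sum(hi - lo for _, lo, hi in results) / k
    bias = sum(est for est, _, _ in results) / k - truth
    return coverage, width, bias


def run_simulation(n_sim=40, n=60, mu=1.0, n_boot=100, seed=1):
    """Assumed toy setup: negatives ~ N(0, 1), positives ~ N(mu, 1), balanced classes."""
    rng = random.Random(seed)
    # Under this assumed model the true AUC is Phi(mu / sqrt(2)).
    truth = NormalDist().cdf(mu / 2 ** 0.5)
    results = []
    for _ in range(n_sim):
        labels = [i % 2 for i in range(n)]
        scores = [rng.gauss(mu * y, 1.0) for y in labels]
        lo, hi = bootstrap_ci(scores, labels, n_boot=n_boot, rng=rng)
        results.append((auc(scores, labels), lo, hi))
    return summarize(results, truth)


if __name__ == "__main__":
    coverage, width, bias = run_simulation()
    print(f"coverage={coverage:.2f}  mean width={width:.3f}  bias={bias:+.4f}")
```

A well-calibrated 95% interval should give a coverage close to 0.95 over many simulated datasets; among methods with adequate coverage, narrower intervals are preferable. The same `summarize` step would apply unchanged to intervals produced by SSA or CV.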