Predictor characteristics necessary for building a clinically useful risk prediction model: a simulation study

UBC Faculty Research and Publications

Schummers, Laura; Himes, Katherine P; Bodnar, Lisa M; Hutcheon, Jennifer A

Abstract

Background: Compelled by the intuitive appeal of predicting each individual patient’s risk of an outcome, there is a growing interest in risk prediction models. While the statistical methods used to build prediction models are increasingly well understood, the literature offers little insight to researchers seeking to gauge a priori whether a prediction model is likely to perform well for their particular research question. The objective of this study was to inform the development of new risk prediction models by evaluating model performance under a wide range of predictor characteristics.

Methods: Data from all births to overweight or obese women in British Columbia, Canada from 2004 to 2012 (n = 75,225) were used to build a risk prediction model for preeclampsia. The data were then augmented with simulated predictors of the outcome with pre-set prevalence values and univariable odds ratios. We built 120 risk prediction models that included known demographic and clinical predictors, and one, three, or five of the simulated variables. Finally, we evaluated standard model performance criteria (discrimination, risk stratification capacity, calibration, and Nagelkerke’s R²) for each model.

Results: Findings from our models built with simulated predictors demonstrated the predictor characteristics required for a risk prediction model to adequately discriminate cases from non-cases and to adequately classify patients into clinically distinct risk groups. Several predictor characteristics can yield well performing risk prediction models; however, these characteristics are not typical of predictor-outcome relationships in many population-based or clinical data sets. Novel predictors must be both strongly associated with the outcome and prevalent in the population to be useful for clinical prediction modeling (e.g., one predictor with prevalence ≥20% and odds ratio ≥8, or 3 predictors with prevalence ≥10% and odds ratios ≥4). Area under the receiver operating characteristic curve values of >0.8 were necessary to achieve reasonable risk stratification capacity.

Conclusions: Our findings provide a guide for researchers to estimate the expected performance of a prediction model before a model has been built based on the characteristics of available predictors.
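To make the link between predictor prevalence, odds ratio, and discrimination concrete, the following is a minimal Python sketch. It is not the authors' simulation procedure: it considers a single binary predictor on its own, with no demographic or clinical covariates, so its output should not be expected to match the study's results. The function names `joint_cells` and `single_predictor_auc` are hypothetical, and the outcome prevalence of 0.05 is an assumed illustrative value that does not come from the abstract. The sketch solves for the 2x2 table implied by a given predictor prevalence, outcome prevalence, and univariable odds ratio, and then computes the area under the ROC curve for a lone binary predictor, which equals (sensitivity + specificity) / 2.

```python
import math


def joint_cells(prev_x, prev_y, odds_ratio):
    """Cell probabilities of the 2x2 table implied by the two margins
    and the odds ratio.

    Returns (a, b, c, d) with
      a = P(X=1, Y=1), b = P(X=1, Y=0),
      c = P(X=0, Y=1), d = P(X=0, Y=0).
    """
    if odds_ratio == 1.0:
        # Independence: the joint probability is the product of the margins.
        a = prev_x * prev_y
    else:
        # Solve (a * d) / (b * c) = odds_ratio for a; this is a quadratic
        # whose smaller root is the valid probability.
        k = odds_ratio - 1.0
        s = 1.0 + k * (prev_x + prev_y)
        disc = s * s - 4.0 * k * odds_ratio * prev_x * prev_y
        a = (s - math.sqrt(disc)) / (2.0 * k)
    return a, prev_x - a, prev_y - a, 1.0 - prev_x - prev_y + a


def single_predictor_auc(prev_x, prev_y, odds_ratio):
    """AUC of one binary predictor used alone: (sensitivity + specificity) / 2."""
    a, b, _, _ = joint_cells(prev_x, prev_y, odds_ratio)
    sensitivity = a / prev_y                 # P(X=1 | Y=1)
    false_positive_rate = b / (1.0 - prev_y)  # P(X=1 | Y=0)
    return 0.5 * (1.0 + sensitivity - false_positive_rate)


# ASSUMPTION: illustrative outcome prevalence, not taken from the study.
ASSUMED_OUTCOME_PREVALENCE = 0.05

for prevalence in (0.10, 0.20):
    for odds_ratio in (2.0, 4.0, 8.0):
        auc = single_predictor_auc(prevalence, ASSUMED_OUTCOME_PREVALENCE, odds_ratio)
        print(f"prevalence={prevalence:.2f}  OR={odds_ratio:.0f}  AUC={auc:.3f}")
```

Because a lone binary predictor yields a single point on the ROC curve, this calculation gives only a rough lower bound on what a multivariable model containing that predictor might achieve; the study's models combined the simulated predictors with known demographic and clinical predictors.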

Item Metadata

Title	Predictor characteristics necessary for building a clinically useful risk prediction model: a simulation study
Creator	Schummers, Laura; Himes, Katherine P; Bodnar, Lisa M; Hutcheon, Jennifer A
Publisher	BioMed Central
Date Issued	2016-09-21
Subject	Epidemiologic methods; Risk prediction model; Discrimination; Risk classification; Model performance; Area under the receiver operating characteristic curve
Genre	Article
Type	Text
Language	eng
Date Available	2018-05-15
Provider	Vancouver : University of British Columbia Library
Rights	Attribution 4.0 International (CC BY 4.0)
DOI	10.14288/1.0366839
URI	http://hdl.handle.net/2429/65898
Affiliation	Medicine, Faculty of; Non UBC; Obstetrics and Gynaecology, Department of
Citation	BMC Medical Research Methodology. 2016 Sep 21;16(1):123
Publisher DOI	10.1186/s12874-016-0223-2
Peer Review Status	Reviewed
Scholarly Level	Faculty
Copyright Holder	The Author(s).
Rights URI	http://creativecommons.org/licenses/by/4.0/
Aggregated Source Repository	DSpace
