Adding contextual predictors improves the performance of cancer survival models

Authors: Guo Y, Bian J, Li Q, George TJ, Shenkman EA

Category: Early Detection & Risk Prediction, Lifestyles Behavior, Energy Balance & Chemoprevention
Conference Year: 2018

Abstract Body:
Cancer is the second leading cause of death in the US. To improve cancer prognosis and survival rates, a better understanding of the multi-level factors that contribute to cancer survival is needed. However, prior research on cancer survival has focused primarily on individual-level factors because integrated datasets have been scarce. In this study, we examined how data integration affects the performance of cancer survival prediction models. We linked data from 4 different sources and evaluated the performance of Cox proportional hazards models for breast, lung, and colorectal cancers under 3 common data integration scenarios. We showed that adding contextual-level predictors to survival models by linking multiple datasets improved model fit and performance. We also showed that different representations of the same variable or concept have differential impacts on model performance. When building statistical models for cancer outcomes, it is important to consider cross-level predictor interactions.
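
As an illustration of how the performance of two survival models might be compared, the sketch below computes Harrell's concordance index (C-index) for two sets of risk scores. The abstract does not state which performance metric was used, so the choice of C-index is an assumption. The survival times, event indicators, and risk scores are synthetic toy values, not study data or results, and the names concordance_index, risk_individual_only, and risk_with_contextual are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: among comparable pairs, the share in which the
    subject with the shorter survival time has the higher risk score.
    Tied risk scores count as half-concordant."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times and pairs whose earlier subject is censored,
        # because their true ordering is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Synthetic toy data (assumption, for illustration only).
times = [5, 8, 12, 20, 30, 42]   # follow-up time, e.g. in months
events = [1, 1, 0, 1, 1, 0]      # 1 = death observed, 0 = censored

# Hypothetical risk scores from a model with individual-level predictors
# only, and from a model that also includes contextual-level predictors.
risk_individual_only = [0.9, 0.4, 0.5, 0.6, 0.2, 0.3]
risk_with_contextual = [0.9, 0.8, 0.5, 0.6, 0.1, 0.3]

c_individual = concordance_index(times, events, risk_individual_only)
c_contextual = concordance_index(times, events, risk_with_contextual)
print(f"C-index, individual-level only: {c_individual:.3f}")
print(f"C-index, with contextual level: {c_contextual:.3f}")
```

A higher C-index means the model ranks patients by risk more accurately. In practice the risk scores would come from fitted Cox models, and a fit statistic such as AIC or a likelihood ratio test could complement the comparison.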

Keywords: data heterogeneities, data integration, multilevel data analysis, interactions, model performance