Does Representation of Time Affect Survival Prediction Accuracy? SoarTech teammate examines the two primary formulations for answers.
SoarTech’s Michael Sloma, along with external colleagues Fayeq Syed, Mohammedreza Nemati, and Kevin S. Xu, compared the accuracy and convenience of current survival prediction models in their paper, “Empirical Comparison of Continuous and Discrete-time Representations for Survival Prediction,” published for the Association for the Advancement of Artificial Intelligence Symposium, March 22-24.
The team empirically investigated continuous and discrete-time representations for survival prediction to quantify the trade-offs between the two formulations.
The main challenge in survival prediction is the presence of incomplete observations due to censoring: for some subjects the event of interest is never observed, so their exact survival time is unknown. The classical formulation treats the survival time as a continuous outcome, which leads to a censored regression problem. A newer formulation discretizes time into a finite number of bins and then applies multi-task binary classification, with one classification task per bin. The discrete-time formulation requires fewer assumptions, but it also loses information compared to the classical formulation.
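To make the discrete-time formulation concrete, here is a minimal sketch of one common way to turn a single observation into per-bin binary labels. It is an illustration under stated assumptions, not the encoding used in the paper: the function names `equal_width_edges` and `discretize`, the equal-width bins, and the "survived past this bin" label convention are all hypothetical choices made for this example.

```python
from bisect import bisect_left


def equal_width_edges(max_time, n_bins):
    """Return the right edge of each of n_bins equal-width bins on (0, max_time]."""
    width = max_time / n_bins
    return [width * (k + 1) for k in range(n_bins)]


def discretize(time, event, edges):
    """Encode one (time, event) observation as per-bin labels plus a mask.

    labels[j] = 1 if the subject is known to have survived past bin j, else 0.
    mask[j]   = 1 if labels[j] is actually observed, 0 if censoring hides it.
    Assumption: bins are (left, right]; times beyond the last edge are
    assigned to the last bin.
    """
    n_bins = len(edges)
    k = min(bisect_left(edges, time), n_bins - 1)  # bin containing `time`
    labels, mask = [], []
    for j in range(n_bins):
        if j < k:
            # Every bin before bin k was survived, censored or not.
            labels.append(1)
            mask.append(1)
        elif event:
            # Event observed in bin k: not alive at the end of bin k or later.
            labels.append(0)
            mask.append(1)
        else:
            # Censored in bin k: status from bin k onward is unknown.
            labels.append(0)
            mask.append(0)
    return labels, mask


edges = equal_width_edges(max_time=10, n_bins=5)
print(discretize(5, True, edges))   # event observed at t = 5
print(discretize(5, False, edges))  # censored at t = 5
```

Each bin becomes one binary classification task, and the mask marks which tasks a censored subject can contribute to. The continuous-time value (here, t = 5) is reduced to a bin index, which is the information loss the discrete-time formulation accepts.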
The team found that discretizing time does not necessarily decrease prediction accuracy. In fact, it can result in more accurate predictors than continuous-time models. The data demonstrate that the key factor affecting accuracy is the number of time bins used for discretization, which should therefore be tuned as a hyperparameter rather than specified for convenience.
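One way to see why the number of bins behaves like a hyperparameter is to look at what changes as it grows. The sketch below is only an illustration with made-up event times, not the paper's experiment: the helper `summarize_bins` is hypothetical, it assumes equal-width bins, and it uses the bin midpoint as a stand-in for the time a bin represents.

```python
import math


def summarize_bins(times, max_time, n_bins):
    """Return (mean |time - bin midpoint|, fewest events in any bin).

    Assumption: equal-width bins of the form (left, right] on (0, max_time].
    """
    width = max_time / n_bins
    counts = [0] * n_bins
    total_error = 0.0
    for t in times:
        # Bin index for t, clipped to the valid range of bins.
        k = min(max(math.ceil(t / width) - 1, 0), n_bins - 1)
        counts[k] += 1
        total_error += abs(t - (k + 0.5) * width)
    return total_error / len(times), min(counts)


times = [1, 2, 3, 5, 8, 9, 10, 12]  # made-up event times, for illustration only
for n_bins in (2, 4, 12):
    error, fewest = summarize_bins(times, max_time=12, n_bins=n_bins)
    print(n_bins, error, fewest)
```

With these made-up times, adding bins shrinks the time resolution lost to discretization, but it also leaves some bins with no events at all, so each per-bin classification task has less data to learn from. Neither extreme is obviously best, which is consistent with choosing the bin count by validation rather than fixing it in advance.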