Predictive Models for Kidney Offer Acceptance: Challenges and Strategies
- Carlos Martinez,
- Md Nasir,
- Meghana Kshirsagar,
- Cass McCharen,
- Rae Shean,
- Juan M. Lavista Ferres,
- Rahul Dodhia,
- Bill Weeks
Journal of Transplantation
Background
Predicting whether an organ offer will be accepted for transplantation remains challenging for several reasons, including large offer volumes, highly imbalanced observations (declines far outnumber acceptances), and a lack of information about the human decision-making process. Offer acceptance models are used for risk-adjusted program evaluations and policy development, but there is little literature on baselines and best practices for predictive applications. We compared a suite of machine learning models, feature sets, and sampling procedures to identify their impact on the performance of offer acceptance prediction models.
Methods
We evaluated several kidney offer acceptance models, from logistic regression to gradient boosted trees, trained on donor and candidate characteristics. We then selected the best-performing model and augmented its training data with additional features (e.g., distance from the closest airport to the transplant hospital) or applied additional sampling procedures (e.g., undersampling).
Results
Compared to the baseline logistic regression model (average precision: 0.0645), the XGBoost model offered the largest performance improvement (average precision: 0.0907). Including transportation-related features further improved performance (average precision: 0.0940); however, we did not observe substantial performance differences across sampling procedures.
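The exact sampling procedure is not specified beyond "undersampling"; a minimal random-undersampling sketch, dropping majority-class (declined-offer) rows until the classes are balanced, might look like this:

```python
import numpy as np

def undersample(X, y, ratio=1.0, seed=0):
    """Randomly keep only ratio * n_minority majority-class rows.
    A hypothetical sketch, not the study's actual procedure."""
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)  # accepted offers (minority class)
    neg = np.flatnonzero(y == 0)  # declined offers (majority class)
    keep_neg = rng.choice(neg, size=int(ratio * len(pos)), replace=False)
    idx = rng.permutation(np.concatenate([pos, keep_neg]))
    return X[idx], y[idx]

# Toy data: 990 declines, 10 acceptances
y = np.array([0] * 990 + [1] * 10)
X = np.arange(1000).reshape(-1, 1)
X_bal, y_bal = undersample(X, y)
print(int(y_bal.sum()), int((y_bal == 0).sum()))  # prints "10 10"
```

Note that undersampling discards most of the decline data, which may explain why it yields little benefit when the model family already handles imbalance well.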
Conclusions
Leveraging advanced machine learning models and incorporating nonclinical data (such as transportation distances) can improve transplant organ offer acceptance prediction models. However, we observed steep tradeoffs between precision and recall, reflected in low average precision scores despite deceptively high AUROCs (baseline AUROC: 0.832). Our findings suggest that even the best-performing models would not provide clear, equitable benefits over existing allocation policies, and more research is needed before these models are practical for clinical implementation.
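The "deceptively high AUROC" phenomenon can be reproduced on a small synthetic example (the numbers below are illustrative, not from the study): when positives are rare, a scorer can rank them above almost all negatives, earning a strong AUROC, while the few negatives that outscore them still swamp the precision:

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

# 10 accepted offers scored 0.9, but 50 of 1,000 declined offers
# score higher (0.95): ranking looks good overall, precision collapses.
y_true = np.array([1] * 10 + [0] * 1000)
scores = np.array([0.9] * 10 + [0.95] * 50 + [0.1] * 950)

auroc = roc_auc_score(y_true, scores)         # each positive outranks 950/1000 negatives
ap = average_precision_score(y_true, scores)  # only 10 of the top 60 are positives
print(f"AUROC={auroc:.3f}, AP={ap:.3f}")      # prints "AUROC=0.950, AP=0.167"
```

This is why the abstract reports average precision alongside AUROC: under heavy class imbalance, AUROC alone can overstate how useful the model's predictions are in practice.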