Machine learning does not outperform traditional statistical modelling for kidney allograft failure prediction

Agathe Truchot; Marc Raynaud; Nassim Kamar; Maarten Naesens; Christophe Legendre; Michel Delahousse; Olivier Thaunat; Matthias Buchler; Marta Crespo; Kamilla Linhares; Babak J. Orandi; Enver Akalin; Gervacio Soler Pujol; Helio Tedesco Silva; Gaurav Gupta; Dorry L. Segev; Xavier Jouven; Andrew J. Bentall; Mark D. Stegall; Carmen Lefaucheur; Olivier Aubert; Alexandre Loupy

doi:10.1016/j.kint.2022.12.011

Medicine

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Machine learning (ML) models have recently shown potential for predicting kidney allograft outcomes. However, their ability to outperform traditional approaches remains poorly investigated. Therefore, using large cohorts of kidney transplant recipients from 14 centers worldwide, we developed ML-based prediction models for kidney allograft survival and compared their prediction performances to those achieved by a validated Cox-Based Prognostication System (CBPS). In a French derivation cohort of 4000 patients, candidate determinants of allograft failure, including donor, recipient and transplant-related parameters, were used as predictors to develop tree-based models (RSF, RSF-ERT, CIF), Support Vector Machine models (LK-SVM, AK-SVM) and a gradient boosting model (XGBoost). Models were externally validated with cohorts of 2214 patients from Europe, 1537 from North America, and 671 from South America. Among these 8422 kidney transplant recipients, 1081 (12.84%) lost their grafts after a median post-transplant follow-up time of 6.25 years (interquartile range 4.33-8.73). At seven years post-risk evaluation, the ML models achieved a C-index of 0.788 (95% bootstrap percentile confidence interval 0.736-0.833), 0.779 (0.724-0.825), 0.786 (0.735-0.832), 0.527 (0.456-0.602), 0.704 (0.648-0.759) and 0.767 (0.711-0.815) for RSF, RSF-ERT, CIF, LK-SVM, AK-SVM and XGBoost respectively, compared with 0.808 (0.792-0.829) for the CBPS. In validation cohorts, the ML models' discrimination performances were in a range similar to those of the CBPS. Calibrations of the ML models were similar to or less accurate than those of the CBPS. Thus, when using a transparent methodological pipeline in validated international cohorts, ML models, despite overall good performances, do not outperform a traditional CBPS in predicting kidney allograft failure. Hence, our current study supports the continued use of traditional statistical approaches for kidney graft prognostication.
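The abstract reports discrimination as a C-index with a 95% bootstrap percentile confidence interval. As a reading aid, the following is a minimal sketch of how such a figure can be computed in principle. It is not the authors' pipeline: the function names `concordance_index` and `bootstrap_percentile_ci` are hypothetical, the data in the demo are synthetic, and the sketch uses plain Harrell's C-index without the seven-year truncation described in the abstract.

```python
import random


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative only).

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's time. The pair is concordant when
    subject i also has the higher predicted risk; risk ties count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_percentile_ci(times, events, risk_scores,
                            n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C-index (illustrative only)."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        # Resample subjects with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(concordance_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [risk_scores[k] for k in idx],
            ))
        except ValueError:
            continue  # skip degenerate resamples with no comparable pairs
    if not stats:
        raise ValueError("all bootstrap resamples were degenerate")
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper


if __name__ == "__main__":
    # Synthetic data (assumption): higher risk shortens expected time.
    rng = random.Random(42)
    risk = [rng.random() for _ in range(100)]
    times = [rng.expovariate(0.1 + r) for r in risk]
    events = [rng.random() < 0.7 for _ in range(100)]
    c = concordance_index(times, events, risk)
    lo, hi = bootstrap_percentile_ci(times, events, risk, n_boot=100)
    print(f"C-index {c:.3f} (95% bootstrap percentile CI {lo:.3f}-{hi:.3f})")
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the 0.527 reported for LK-SVM indicates near-chance discrimination while the values close to 0.8 indicate good discrimination.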

Original language	English (US)
Pages (from-to)	936-948
Number of pages	13
Journal	Kidney international
Volume	103
Issue number	5
DOIs	https://doi.org/10.1016/j.kint.2022.12.011
State	Published - May 2023

Keywords

artificial intelligence
prediction
transplantation

ASJC Scopus subject areas

Nephrology

Access to Document

10.1016/j.kint.2022.12.011

Cite this

Truchot, A., Raynaud, M., Kamar, N., Naesens, M., Legendre, C., Delahousse, M., Thaunat, O., Buchler, M., Crespo, M., Linhares, K., Orandi, B. J., Akalin, E., Pujol, G. S., Silva, H. T., Gupta, G., Segev, D. L., Jouven, X., Bentall, A. J., Stegall, M. D., ... Loupy, A. (2023). Machine learning does not outperform traditional statistical modelling for kidney allograft failure prediction. Kidney international, 103(5), 936-948. https://doi.org/10.1016/j.kint.2022.12.011

Truchot, A, Raynaud, M, Kamar, N, Naesens, M, Legendre, C, Delahousse, M, Thaunat, O, Buchler, M, Crespo, M, Linhares, K, Orandi, BJ, Akalin, E, Pujol, GS, Silva, HT, Gupta, G, Segev, DL, Jouven, X, Bentall, AJ, Stegall, MD, Lefaucheur, C, Aubert, O & Loupy, A 2023, 'Machine learning does not outperform traditional statistical modelling for kidney allograft failure prediction', Kidney international, vol. 103, no. 5, pp. 936-948. https://doi.org/10.1016/j.kint.2022.12.011

@article{71c2bead1a3e43faaf9bff834ee7323c,

title = "Machine learning does not outperform traditional statistical modelling for kidney allograft failure prediction",

keywords = "artificial intelligence, prediction, transplantation",

author = "Agathe Truchot and Marc Raynaud and Nassim Kamar and Maarten Naesens and Christophe Legendre and Michel Delahousse and Olivier Thaunat and Matthias Buchler and Marta Crespo and Kamilla Linhares and Orandi, {Babak J.} and Enver Akalin and Pujol, {Gervacio Soler} and Silva, {Helio Tedesco} and Gaurav Gupta and Segev, {Dorry L.} and Xavier Jouven and Bentall, {Andrew J.} and Stegall, {Mark D.} and Carmen Lefaucheur and Olivier Aubert and Alexandre Loupy",

note = "Publisher Copyright: {\textcopyright} 2022 International Society of Nephrology",

year = "2023",

month = may,

doi = "10.1016/j.kint.2022.12.011",

language = "English (US)",

volume = "103",

pages = "936--948",

journal = "Kidney international",

issn = "0085-2538",

publisher = "Nature Publishing Group",

number = "5",

}

TY - JOUR

T1 - Machine learning does not outperform traditional statistical modelling for kidney allograft failure prediction

AU - Truchot, Agathe

AU - Raynaud, Marc

AU - Kamar, Nassim

AU - Naesens, Maarten

AU - Legendre, Christophe

AU - Delahousse, Michel

AU - Thaunat, Olivier

AU - Buchler, Matthias

AU - Crespo, Marta

AU - Linhares, Kamilla

AU - Orandi, Babak J.

AU - Akalin, Enver

AU - Pujol, Gervacio Soler

AU - Silva, Helio Tedesco

AU - Gupta, Gaurav

AU - Segev, Dorry L.

AU - Jouven, Xavier

AU - Bentall, Andrew J.

AU - Stegall, Mark D.

AU - Lefaucheur, Carmen

AU - Aubert, Olivier

AU - Loupy, Alexandre

PY - 2023/5

Y1 - 2023/5

KW - artificial intelligence

KW - prediction

KW - transplantation

UR - http://www.scopus.com/inward/record.url?scp=85146460351&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85146460351&partnerID=8YFLogxK

U2 - 10.1016/j.kint.2022.12.011

DO - 10.1016/j.kint.2022.12.011

M3 - Article

C2 - 36572246

AN - SCOPUS:85146460351

SN - 0085-2538

VL - 103

SP - 936

EP - 948

JO - Kidney international

JF - Kidney international

IS - 5

ER -
