TY - JOUR
T1 - Development of an automated, general-purpose prediction tool for postoperative respiratory failure using machine learning
T2 - A retrospective cohort study
AU - Kiyatkin, Michael E.
AU - Aasman, Boudewijn
AU - Fazzari, Melissa J.
AU - Rudolph, Maíra I.
AU - Vidal Melo, Marcos F.
AU - Eikermann, Matthias
AU - Gong, Michelle N.
N1 - Publisher Copyright: © 2023 Elsevier Inc.
PY - 2023/11
Y1 - 2023/11
N2 - Study objective: Postoperative respiratory failure is a major surgical complication and key quality metric. Existing prediction tools underperform, are limited to specific populations, and require manual calculation, which limits their implementation. We aimed to create an improved, machine learning-powered prediction tool with ideal characteristics for automated calculation. Design, setting, and patients: We retrospectively reviewed 101,455 anesthetic procedures from 1/2018 to 6/2021. The primary outcome was the Standardized Endpoints in Perioperative Medicine consensus definition for postoperative respiratory failure. Secondary outcomes were respiratory quality metrics from the National Surgical Quality Improvement Program, Society of Thoracic Surgeons, and Centers for Medicare and Medicaid Services (CMS). We abstracted 26 procedural and physiologic variables, previously identified as respiratory failure risk factors, from the electronic health record. We randomly split the cohort and used the Random Forest method to predict the composite outcome in the training cohort. We coined this the RESPIRE model and measured its accuracy in the validation cohort using area under the receiver operating characteristic curve (AUROC) analysis, among other measures, and compared this with ARISCAT and SPORC-1, two leading prediction tools. We compared performance in a validation cohort using score cut-offs determined in a separate test cohort. Main results: The RESPIRE model exhibited superior accuracy with an AUROC of 0.93 (95% CI, 0.92–0.95) compared with 0.82 for both ARISCAT and SPORC-1 (P-for-difference < 0.0001 for both). At comparable 80–90% sensitivities, RESPIRE had a higher positive predictive value (11%, 95% CI: 10–12%) and a lower false positive rate (12%, 95% CI: 12–13%), compared with a positive predictive value of 4% and a false positive rate of 37% for both ARISCAT and SPORC-1. The RESPIRE model also better predicted the established quality metrics for postoperative respiratory failure. Conclusions: We developed a general-purpose, machine learning-powered prediction tool with superior performance for research and quality-based definitions of postoperative respiratory failure.
KW - Machine learning
KW - Perioperative medicine
KW - Postoperative respiratory failure
KW - Preoperative prediction
UR - http://www.scopus.com/inward/record.url?scp=85165245837&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85165245837&partnerID=8YFLogxK
U2 - 10.1016/j.jclinane.2023.111194
DO - 10.1016/j.jclinane.2023.111194
M3 - Article
C2 - 37422982
AN - SCOPUS:85165245837
SN - 0952-8180
VL - 90
JO - Journal of Clinical Anesthesia
JF - Journal of Clinical Anesthesia
M1 - 111194
ER -