Development of an automated, general-purpose prediction tool for postoperative respiratory failure using machine learning: A retrospective cohort study

Michael E. Kiyatkin; Boudewijn Aasman; Melissa J. Fazzari; Maíra I. Rudolph; Marcos F. Vidal Melo; Matthias Eikermann; Michelle N. Gong

doi:10.1016/j.jclinane.2023.111194

Research output: Contribution to journal › Article › peer-review

Abstract

Study objective: Postoperative respiratory failure is a major surgical complication and key quality metric. Existing prediction tools underperform, are limited to specific populations, and necessitate manual calculation. This limits their implementation. We aimed to create an improved, machine learning powered prediction tool with ideal characteristics for automated calculation.

Design, setting, and patients: We retrospectively reviewed 101,455 anesthetic procedures from 1/2018 to 6/2021. The primary outcome was the Standardized Endpoints in Perioperative Medicine consensus definition for postoperative respiratory failure. Secondary outcomes were respiratory quality metrics from the National Surgery Quality Improvement Sample, Society of Thoracic Surgeons, and CMS. We abstracted from the electronic health record 26 procedural and physiologic variables previously identified as respiratory failure risk factors. We randomly split the cohort and used the Random Forest method to predict the composite outcome in the training cohort. We coined this the RESPIRE model and measured its accuracy in the validation cohort using area under the receiver operating curve (AUROC) analysis, among other measures, and compared this with ARISCAT and SPORC-1, two leading prediction tools. We compared performance in a validation cohort using score cut-offs determined in a separate test cohort.

Main results: The RESPIRE model exhibited superior accuracy with an AUROC of 0.93 (95% CI, 0.92–0.95) compared to 0.82 for both ARISCAT and SPORC-1 (P-for-difference < 0.0001 for both). At comparable 80–90% sensitivities, RESPIRE had higher positive predictive value (11%, 95% CI: 10–12%) and lower false positive rate (12%, 95% CI: 12–13%) compared to 4% and 37% for both ARISCAT and SPORC-1. The RESPIRE model also better predicted the established quality metrics for postoperative respiratory failure.

Conclusions: We developed a general-purpose, machine learning powered prediction tool with superior performance for research and quality-based definitions of postoperative respiratory failure.

Original language	English (US)
Article number	111194
Journal	Journal of Clinical Anesthesia
Volume	90
DOIs	https://doi.org/10.1016/j.jclinane.2023.111194
State	Published - Nov 2023

Keywords

Machine learning
Perioperative medicine
Postoperative respiratory failure
Preoperative prediction

ASJC Scopus subject areas

Anesthesiology and Pain Medicine

Cite this

@article{c74ff3e2e0154eab9bc130c46c600f08,

title = "Development of an automated, general-purpose prediction tool for postoperative respiratory failure using machine learning: A retrospective cohort study",

abstract = "Study objective: Postoperative respiratory failure is a major surgical complication and key quality metric. Existing prediction tools underperform, are limited to specific populations, and necessitate manual calculation. This limits their implementation. We aimed to create an improved, machine learning powered prediction tool with ideal characteristics for automated calculation. Design, setting, and patients: We retrospectively reviewed 101,455 anesthetic procedures from 1/2018 to 6/2021. The primary outcome was the Standardized Endpoints in Perioperative Medicine consensus definition for postoperative respiratory failure. Secondary outcomes were respiratory quality metrics from the National Surgery Quality Improvement Sample, Society of Thoracic Surgeons, and CMS. We abstracted from the electronic health record 26 procedural and physiologic variables previously identified as respiratory failure risk factors. We randomly split the cohort and used the Random Forest method to predict the composite outcome in the training cohort. We coined this the RESPIRE model and measured its accuracy in the validation cohort using area under the receiver operating curve (AUROC) analysis, among other measures, and compared this with ARISCAT and SPORC-1, two leading prediction tools. We compared performance in a validation cohort using score cut-offs determined in a separate test cohort. Main results: The RESPIRE model exhibited superior accuracy with an AUROC of 0.93 (95% CI, 0.92–0.95) compared to 0.82 for both ARISCAT and SPORC-1 (P-for-difference < 0.0001 for both). At comparable 80–90% sensitivities, RESPIRE had higher positive predictive value (11%, 95% CI: 10–12%) and lower false positive rate (12%, 95% CI: 12–13%) compared to 4% and 37% for both ARISCAT and SPORC-1. The RESPIRE model also better predicted the established quality metrics for postoperative respiratory failure. 
Conclusions: We developed a general-purpose, machine learning powered prediction tool with superior performance for research and quality-based definitions of postoperative respiratory failure.",

keywords = "Machine learning, Perioperative medicine, Postoperative respiratory failure, Preoperative prediction",

author = "Kiyatkin, {Michael E.} and Boudewijn Aasman and Fazzari, {Melissa J.} and Rudolph, {Ma{\'i}ra I.} and {Vidal Melo}, {Marcos F.} and Matthias Eikermann and Gong, {Michelle N.}",

note = "Publisher Copyright: {\textcopyright} 2023 Elsevier Inc.",

year = "2023",

month = nov,

doi = "10.1016/j.jclinane.2023.111194",

language = "English (US)",

volume = "90",

journal = "Journal of Clinical Anesthesia",

issn = "0952-8180",

publisher = "Elsevier Inc.",

}

TY - JOUR

T1 - Development of an automated, general-purpose prediction tool for postoperative respiratory failure using machine learning

T2 - A retrospective cohort study

AU - Kiyatkin, Michael E.

AU - Aasman, Boudewijn

AU - Fazzari, Melissa J.

AU - Rudolph, Maíra I.

AU - Vidal Melo, Marcos F.

AU - Eikermann, Matthias

AU - Gong, Michelle N.

PY - 2023/11

Y1 - 2023/11

N2 - Study objective: Postoperative respiratory failure is a major surgical complication and key quality metric. Existing prediction tools underperform, are limited to specific populations, and necessitate manual calculation. This limits their implementation. We aimed to create an improved, machine learning powered prediction tool with ideal characteristics for automated calculation. Design, setting, and patients: We retrospectively reviewed 101,455 anesthetic procedures from 1/2018 to 6/2021. The primary outcome was the Standardized Endpoints in Perioperative Medicine consensus definition for postoperative respiratory failure. Secondary outcomes were respiratory quality metrics from the National Surgery Quality Improvement Sample, Society of Thoracic Surgeons, and CMS. We abstracted from the electronic health record 26 procedural and physiologic variables previously identified as respiratory failure risk factors. We randomly split the cohort and used the Random Forest method to predict the composite outcome in the training cohort. We coined this the RESPIRE model and measured its accuracy in the validation cohort using area under the receiver operating curve (AUROC) analysis, among other measures, and compared this with ARISCAT and SPORC-1, two leading prediction tools. We compared performance in a validation cohort using score cut-offs determined in a separate test cohort. Main results: The RESPIRE model exhibited superior accuracy with an AUROC of 0.93 (95% CI, 0.92–0.95) compared to 0.82 for both ARISCAT and SPORC-1 (P-for-difference < 0.0001 for both). At comparable 80–90% sensitivities, RESPIRE had higher positive predictive value (11%, 95% CI: 10–12%) and lower false positive rate (12%, 95% CI: 12–13%) compared to 4% and 37% for both ARISCAT and SPORC-1. The RESPIRE model also better predicted the established quality metrics for postoperative respiratory failure. 
Conclusions: We developed a general-purpose, machine learning powered prediction tool with superior performance for research and quality-based definitions of postoperative respiratory failure.

AB - Study objective: Postoperative respiratory failure is a major surgical complication and key quality metric. Existing prediction tools underperform, are limited to specific populations, and necessitate manual calculation. This limits their implementation. We aimed to create an improved, machine learning powered prediction tool with ideal characteristics for automated calculation. Design, setting, and patients: We retrospectively reviewed 101,455 anesthetic procedures from 1/2018 to 6/2021. The primary outcome was the Standardized Endpoints in Perioperative Medicine consensus definition for postoperative respiratory failure. Secondary outcomes were respiratory quality metrics from the National Surgery Quality Improvement Sample, Society of Thoracic Surgeons, and CMS. We abstracted from the electronic health record 26 procedural and physiologic variables previously identified as respiratory failure risk factors. We randomly split the cohort and used the Random Forest method to predict the composite outcome in the training cohort. We coined this the RESPIRE model and measured its accuracy in the validation cohort using area under the receiver operating curve (AUROC) analysis, among other measures, and compared this with ARISCAT and SPORC-1, two leading prediction tools. We compared performance in a validation cohort using score cut-offs determined in a separate test cohort. Main results: The RESPIRE model exhibited superior accuracy with an AUROC of 0.93 (95% CI, 0.92–0.95) compared to 0.82 for both ARISCAT and SPORC-1 (P-for-difference < 0.0001 for both). At comparable 80–90% sensitivities, RESPIRE had higher positive predictive value (11%, 95% CI: 10–12%) and lower false positive rate (12%, 95% CI: 12–13%) compared to 4% and 37% for both ARISCAT and SPORC-1. The RESPIRE model also better predicted the established quality metrics for postoperative respiratory failure. 
Conclusions: We developed a general-purpose, machine learning powered prediction tool with superior performance for research and quality-based definitions of postoperative respiratory failure.

KW - Machine learning

KW - Perioperative medicine

KW - Postoperative respiratory failure

KW - Preoperative prediction

UR - http://www.scopus.com/inward/record.url?scp=85165245837&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85165245837&partnerID=8YFLogxK

U2 - 10.1016/j.jclinane.2023.111194

DO - 10.1016/j.jclinane.2023.111194

M3 - Article

C2 - 37422982

AN - SCOPUS:85165245837

SN - 0952-8180

VL - 90

JO - Journal of Clinical Anesthesia

JF - Journal of Clinical Anesthesia

M1 - 111194

ER -
