Quality assessment of machine learning models for diagnostic imaging in orthopaedics: A systematic review

Amanda Lans; Robertus J.B. Pierik; John R. Bales; Mitchell S. Fourman; David Shin; Laura N. Kanbier; Jack Rifkin; William H. DiGiovanni; Rohan R. Chopra; Rana Moeinzad; Jorrit Jan Verlaan; Joseph H. Schwab

doi:10.1016/j.artmed.2022.102396

Research output: Contribution to journal › Review article › peer-review

8 Scopus citations

Abstract

Background: Machine learning (ML) models are emerging at a rapid pace in orthopaedic imaging due to their ability to facilitate timely diagnostic and treatment decision making. However, despite a considerable increase in model development and ML-related publications, there has been little evaluation regarding the quality of these studies. In order to successfully move forward with the implementation of ML models for diagnostic imaging in orthopaedics, it is imperative that we ensure models are held at a high standard and provide applicable, reliable and accurate results. Multiple reporting guidelines have been developed to help authors and reviewers of ML models, such as the Checklist for AI in Medical Imaging (CLAIM) and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Previous investigations of prognostic orthopaedic ML models have reported concerns with regard to the rate of transparent reporting. Therefore, an assessment of whether ML models for diagnostic imaging in orthopaedics adequately and clearly report essential facets of their model development is warranted.

Purposes: To evaluate (1) the completeness of the CLAIM checklist and (2) the risk of bias according to the QUADAS-2 tool for ML-based orthopaedic diagnostic imaging models. This study sought to identify ML details that researchers commonly fail to report and to provide recommendations to improve reporting standards for diagnostic imaging ML models.

Methods: A systematic review was performed to identify ML-based diagnostic imaging models in orthopaedic surgery. Articles published within the last 5 years were included. Two reviewers independently extracted data using the CLAIM checklist and QUADAS-2 tool, and discrepancies were resolved by discussion with at least two additional reviewers.

Results: After screening 7507 articles, 91 met the study criteria. The mean completeness of CLAIM items was 63 % (SD ± 28 %). Among the worst reported CLAIM items were item 28 (metrics of model performance), item 13 (the handling of missing data) and item 9 (data preprocessing steps), with only 2 % (2/91), 8 % (7/91) and 13 % (12/91) of studies correctly reporting these items, respectively. The QUADAS-2 tool revealed that the patient selection domain was at the highest risk of bias: 18 % (16/91) of studies were at high risk of bias and 32 % (29/91) had an unknown risk of bias.

Conclusions: This review demonstrates that the reporting of relevant information, such as handling missing data and data preprocessing steps, by diagnostic ML studies for orthopaedic imaging studies is limited. Additionally, a substantial number of works were at high risk of bias. Future studies describing ML-based models for diagnostic imaging should adhere to acknowledged methodological standards to maximize the quality and applicability of their models.
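
The proportions quoted in the Results can be checked with simple arithmetic. The following is a minimal sketch and not part of the original record: the counts and stated percentages are taken from the abstract, while the dictionary labels and the `percent` helper are illustrative names chosen here.

```python
# Counts and percentages as stated in the abstract; 91 studies were included.
TOTAL_STUDIES = 91

# label -> (number of studies, percentage stated in the abstract)
reported = {
    "CLAIM item 28 (metrics of model performance)": (2, 2),
    "CLAIM item 13 (handling of missing data)": (7, 8),
    "CLAIM item 9 (data preprocessing steps)": (12, 13),
    "QUADAS-2 patient selection, high risk of bias": (16, 18),
    "QUADAS-2 patient selection, unknown risk of bias": (29, 32),
}


def percent(count, total=TOTAL_STUDIES):
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


# Compare each recomputed percentage with the value stated in the abstract.
for label, (count, stated) in reported.items():
    computed = percent(count)
    status = "matches" if computed == stated else "differs from"
    print(f"{label}: {count}/{TOTAL_STUDIES} = {computed} % ({status} the stated {stated} %)")
```

Each fraction is recomputed from its numerator and the 91 included studies, then compared with the rounded percentage given in the abstract.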

Original language	English (US)
Article number	102396
Journal	Artificial Intelligence in Medicine
Volume	132
DOIs	https://doi.org/10.1016/j.artmed.2022.102396
State	Published - Oct 2022
Externally published	Yes

Keywords

Artificial intelligence
Machine learning
Medical imaging
Orthopaedics

ASJC Scopus subject areas

Medicine (miscellaneous)
Artificial Intelligence

Access to Document

10.1016/j.artmed.2022.102396

Cite this

Lans, A., Pierik, R. J. B., Bales, J. R., Fourman, M. S., Shin, D., Kanbier, L. N., Rifkin, J., DiGiovanni, W. H., Chopra, R. R., Moeinzad, R., Verlaan, J. J., & Schwab, J. H. (2022). Quality assessment of machine learning models for diagnostic imaging in orthopaedics: A systematic review. Artificial Intelligence in Medicine, 132, Article 102396. https://doi.org/10.1016/j.artmed.2022.102396

@article{f23fc919e55344a0a4d9028ec69d006d,

title = "Quality assessment of machine learning models for diagnostic imaging in orthopaedics: A systematic review",

abstract = "Background: Machine learning (ML) models are emerging at a rapid pace in orthopaedic imaging due to their ability to facilitate timely diagnostic and treatment decision making. However, despite a considerable increase in model development and ML-related publications, there has been little evaluation regarding the quality of these studies. In order to successfully move forward with the implementation of ML models for diagnostic imaging in orthopaedics, it is imperative that we ensure models are held at a high standard and provide applicable, reliable and accurate results. Multiple reporting guidelines have been developed to help authors and reviewers of ML models, such as the Checklist for AI in Medical Imaging (CLAIM) and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Previous investigations of prognostic orthopaedic ML models have reported concerns with regard to the rate of transparent reporting. Therefore, an assessment of whether ML models for diagnostic imaging in orthopaedics adequately and clearly report essential facets of their model development is warranted. Purposes: To evaluate (1) the completeness of the CLAIM checklist and (2) the risk of bias according to the QUADAS-2 tool for ML-based orthopaedic diagnostic imaging models. This study sought to identify ML details that researchers commonly fail to report and to provide recommendations to improve reporting standards for diagnostic imaging ML models. Methods: A systematic review was performed to identify ML-based diagnostic imaging models in orthopaedic surgery. Articles published within the last 5 years were included. Two reviewers independently extracted data using the CLAIM checklist and QUADAS-2 tool, and discrepancies were resolved by discussion with at least two additional reviewers. Results: After screening 7507 articles, 91 met the study criteria. The mean completeness of CLAIM items was 63 % (SD ± 28 %). 
Among the worst reported CLAIM items were item 28 (metrics of model performance), item 13 (the handling of missing data) and item 9 (data preprocessing steps), with only 2 % (2/91), 8 % (7/91) and 13 % (12/91) of studies correctly reporting these items, respectively. The QUADAS-2 tool revealed that the patient selection domain was at the highest risk of bias: 18 % (16/91) of studies were at high risk of bias and 32 % (29/91) had an unknown risk of bias. Conclusions: This review demonstrates that the reporting of relevant information, such as handling missing data and data preprocessing steps, by diagnostic ML studies for orthopaedic imaging studies is limited. Additionally, a substantial number of works were at high risk of bias. Future studies describing ML-based models for diagnostic imaging should adhere to acknowledged methodological standards to maximize the quality and applicability of their models.",

keywords = "Artificial intelligence, Machine learning, Medical imaging, Orthopaedics",

author = "Amanda Lans and Pierik, {Robertus J.B.} and Bales, {John R.} and Fourman, {Mitchell S.} and David Shin and Kanbier, {Laura N.} and Jack Rifkin and DiGiovanni, {William H.} and Chopra, {Rohan R.} and Rana Moeinzad and Verlaan, {Jorrit Jan} and Schwab, {Joseph H.}",

note = "Publisher Copyright: {\textcopyright} 2022 Elsevier B.V.",

year = "2022",

month = oct,

doi = "10.1016/j.artmed.2022.102396",

language = "English (US)",

volume = "132",

journal = "Artificial Intelligence in Medicine",

issn = "0933-3657",

publisher = "Elsevier",

}

TY - JOUR

T1 - Quality assessment of machine learning models for diagnostic imaging in orthopaedics

T2 - A systematic review

AU - Lans, Amanda

AU - Pierik, Robertus J.B.

AU - Bales, John R.

AU - Fourman, Mitchell S.

AU - Shin, David

AU - Kanbier, Laura N.

AU - Rifkin, Jack

AU - DiGiovanni, William H.

AU - Chopra, Rohan R.

AU - Moeinzad, Rana

AU - Verlaan, Jorrit Jan

AU - Schwab, Joseph H.

PY - 2022/10

Y1 - 2022/10

AB - Background: Machine learning (ML) models are emerging at a rapid pace in orthopaedic imaging due to their ability to facilitate timely diagnostic and treatment decision making. However, despite a considerable increase in model development and ML-related publications, there has been little evaluation regarding the quality of these studies. In order to successfully move forward with the implementation of ML models for diagnostic imaging in orthopaedics, it is imperative that we ensure models are held at a high standard and provide applicable, reliable and accurate results. Multiple reporting guidelines have been developed to help authors and reviewers of ML models, such as the Checklist for AI in Medical Imaging (CLAIM) and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Previous investigations of prognostic orthopaedic ML models have reported concerns with regard to the rate of transparent reporting. Therefore, an assessment of whether ML models for diagnostic imaging in orthopaedics adequately and clearly report essential facets of their model development is warranted. Purposes: To evaluate (1) the completeness of the CLAIM checklist and (2) the risk of bias according to the QUADAS-2 tool for ML-based orthopaedic diagnostic imaging models. This study sought to identify ML details that researchers commonly fail to report and to provide recommendations to improve reporting standards for diagnostic imaging ML models. Methods: A systematic review was performed to identify ML-based diagnostic imaging models in orthopaedic surgery. Articles published within the last 5 years were included. Two reviewers independently extracted data using the CLAIM checklist and QUADAS-2 tool, and discrepancies were resolved by discussion with at least two additional reviewers. Results: After screening 7507 articles, 91 met the study criteria. The mean completeness of CLAIM items was 63 % (SD ± 28 %). Among the worst reported CLAIM items were item 28 (metrics of model performance), item 13 (the handling of missing data) and item 9 (data preprocessing steps), with only 2 % (2/91), 8 % (7/91) and 13 % (12/91) of studies correctly reporting these items, respectively. The QUADAS-2 tool revealed that the patient selection domain was at the highest risk of bias: 18 % (16/91) of studies were at high risk of bias and 32 % (29/91) had an unknown risk of bias. Conclusions: This review demonstrates that the reporting of relevant information, such as handling missing data and data preprocessing steps, by diagnostic ML studies for orthopaedic imaging studies is limited. Additionally, a substantial number of works were at high risk of bias. Future studies describing ML-based models for diagnostic imaging should adhere to acknowledged methodological standards to maximize the quality and applicability of their models.

KW - Artificial intelligence

KW - Machine learning

KW - Medical imaging

KW - Orthopaedics

UR - http://www.scopus.com/inward/record.url?scp=85138782943&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85138782943&partnerID=8YFLogxK

U2 - 10.1016/j.artmed.2022.102396

DO - 10.1016/j.artmed.2022.102396

M3 - Review article

C2 - 36207080

AN - SCOPUS:85138782943

SN - 0933-3657

VL - 132

JO - Artificial Intelligence in Medicine

JF - Artificial Intelligence in Medicine

M1 - 102396

ER -
