A Deep Learning Model to Triage Screening Mammograms: A Simulation Study

Abstract

In this retrospective study, 223 109 consecutive screening mammograms performed in 66 661 women from January 2009 to December 2016 were collected with cancer outcomes obtained through linkage to a regional tumor registry. This cohort was split by patient into 212 272, 25 999, and 26 540 mammograms from 56 831, 7021, and 7176 patients for training, validation, and testing, respectively. A DL model was developed to triage mammograms as cancer free and evaluated on the test set. A DL-triage workflow was simulated in which radiologists skipped mammograms triaged as cancer free (interpreting them as negative for cancer) and read mammograms not triaged as cancer free by using the original interpreting radiologists’ assessments. Sensitivities, specificities, and percentage of mammograms read were calculated, with and without the DL-triage–simulated workflow. Statistics were computed across 5000 bootstrap samples to assess confidence intervals (CIs). Specificities were compared by using a two-tailed t test (P < .05) and sensitivities were compared by using a one-sided t test with a noninferiority margin of 5% (P < .05).

Publication
In Radiological Society of North America