Artificial Intelligence for Breast Cancer Diagnosis: Sensitive But Not Specific

Researchers at the University of Washington and the University of California at Los Angeles (UCLA) have developed a machine-learning system that confirms breast cancer diagnoses made by radiologists, a recent paper in JAMA Network Open reports.

Skilled diagnosticians differ in their interpretation of radiographic images of different forms of cancer. In recent research outside the United States, concordance among physicians is lowest for hormone receptor-positive, human epidermal growth factor receptor 2 (HER2)/neu-positive cancers and highest for triple-negative breast cancers.

The implications of diagnostic ambiguity, however, differ across populations of women. For instance, studies at California hospitals have found that women of Asian-Pacific heritage are more likely than women of other ethnic groups to be diagnosed with HER2-positive breast cancer, yet patterns of late diagnosis persist within groups of Asian-Pacific women. Some studies find no difference in survival outcomes between African American and white women with triple-negative breast cancer, while others suggest that African American women face higher breast cancer mortality even after accounting for ER/PR/HER2 subtypes.

These differences in outcome do not suggest a deficiency in diagnosis, but they do suggest nuance in pathological process. Capturing that nuance in the progression of various forms of cancer is a task for artificial intelligence.

Medical images of breast biopsies contain a great deal of complex data. Deep learning can integrate data from multimodal imaging of breast cancer (CT, PET, and MR) with digitized histological images. AI algorithms can be trained to capture and highlight chromatin alteration, lymphovascular invasion, microvessel density, mitotic figures, and neoangiogenesis. These tasks are time-consuming for radiologists but yield diagnostically meaningful information. With the tools of AI, radiologists become experts who collect and interpret imaging, morphological, molecular, and genetic information for a more accurate and actionable diagnosis.
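To make the fusion idea concrete, here is a minimal sketch of a late-fusion classifier in PyTorch. The architecture, layer sizes, and feature dimensions are illustrative assumptions, not the system described in the paper; in practice each encoder would be a pretrained backbone over its own modality.

```python
# Minimal late-fusion sketch (hypothetical architecture, not the paper's model).
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, radiology_dim=512, histology_dim=512, n_classes=2):
        super().__init__()
        # Each modality gets its own small encoder; in a real system these
        # would be pretrained CNN backbones over CT/PET/MR volumes and
        # digitized slide patches.
        self.radiology_encoder = nn.Sequential(nn.Linear(radiology_dim, 128), nn.ReLU())
        self.histology_encoder = nn.Sequential(nn.Linear(histology_dim, 128), nn.ReLU())
        # The concatenated (fused) representation feeds one diagnostic head.
        self.head = nn.Linear(256, n_classes)

    def forward(self, radiology_feats, histology_feats):
        fused = torch.cat([self.radiology_encoder(radiology_feats),
                           self.histology_encoder(histology_feats)], dim=1)
        return self.head(fused)

model = FusionClassifier()
logits = model(torch.randn(4, 512), torch.randn(4, 512))  # a batch of 4 cases
```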

Computerized neural networks can capture intricate relationships between image features that are typically invisible to the human eye. But the key point of interest to radiologists is that AI algorithms have to be trained. Algorithms fit data to diagnoses. Algorithms can also overfit data to diagnoses. In some situations, the more predictive an algorithm is for its training set, the less predictive it is for actual patients.
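A toy illustration of the problem, using synthetic data and scikit-learn (not the study's model or data): an unconstrained, high-capacity model fits its training cases perfectly yet does markedly worse on held-out cases.

```python
# Overfitting in miniature: a high-capacity model memorizes its training set
# but generalizes poorly. Synthetic data, not clinical data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))                               # 200 "cases", 30 noisy features
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)   # weak true signal in one feature

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)  # depth unconstrained

print("training accuracy:", model.score(X_train, y_train))  # 1.00: the tree memorizes
print("test accuracy:    ", model.score(X_test, y_test))    # markedly lower on new cases
```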

Data scientists commonly attempt to defeat overfitting by synthetically enlarging the training set for the AI algorithm. Called "augmentation," this practice provides the algorithm with as many informative images as possible. Images are rescaled. They are rotated. They are subjected to shear. These operations are affine transformations; they preserve collinearity, keeping points that lie on a line on a line, even as scale, orientation, and angles change. Augmentation provides new training information. But it does not confer the clinical judgment that recognizes exceptions to its rules. Recognizing disease processes that do not fit the norms, such as those for breast cancer in minority women, is not yet a task for artificial intelligence.
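As a sketch of what such a pipeline looks like in practice, here is a typical affine-augmentation recipe using torchvision; the specific parameters (rotation range, scale range, shear angle) are illustrative assumptions, not values taken from the paper.

```python
# Typical affine augmentation pipeline (illustrative parameters).
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomAffine(
        degrees=15,          # random rotation up to +/-15 degrees
        scale=(0.9, 1.1),    # random rescaling
        shear=10,            # random shear up to +/-10 degrees
    ),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
# Applied to a PIL image of a biopsy patch, each call yields a new,
# geometrically perturbed copy of the same underlying tissue.
```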

AI algorithms favor sensitivity over specificity. They are biased to avoid false-negative findings, on the assumption that failure to diagnose cancer carries far more devastating health outcomes than an unnecessary biopsy or premature pharmaceutical intervention. AI may help ensure equitable expectations of treatment for women across racial groupings.
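The trade-off is easy to see numerically. The sketch below uses synthetic suspicion scores (hypothetical values, not clinical data) to show how lowering a decision threshold raises sensitivity at the cost of specificity.

```python
# How a decision threshold trades specificity for sensitivity (synthetic scores).
import numpy as np

rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=1000)                    # 1 = cancer present
scores = labels * 0.3 + rng.normal(0.4, 0.2, size=1000)   # model's "suspicion" score

def sens_spec(threshold):
    preds = scores >= threshold
    sens = (preds & (labels == 1)).sum() / (labels == 1).sum()   # true-positive rate
    spec = (~preds & (labels == 0)).sum() / (labels == 0).sum()  # true-negative rate
    return sens, spec

for t in (0.3, 0.5):
    s, p = sens_spec(t)
    print(f"threshold {t}: sensitivity {s:.2f}, specificity {p:.2f}")
# A lower threshold catches more cancers (higher sensitivity)
# at the cost of more false alarms (lower specificity).
```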

Computerized neural networks will reduce radiologists' workloads. They will alert radiologists to early signs of pathology easily missed by the naked eye. But the real cure for overfitting is clinical judgment. Radiology is not a profession about to be automated. AI will work for radiologists rather than the other way around.