02 February 2025

Foundation Models Improve Histopathology Image Retrieval Accuracy

New evaluation highlights potential and challenges of AI models for medical image diagnostics.

Advances in digital pathology and artificial intelligence (AI) are transforming the field of histopathology, offering unprecedented efficiency and accuracy for disease diagnostics and treatment planning. Recent evaluations of foundation models for image retrieval reveal both promise and challenges, providing insights for future improvements.

Researchers set out to assess the performance of several foundation models for zero-shot whole-slide image (WSI) retrieval, with their evaluations centered on diagnostic slides from The Cancer Genome Atlas (TCGA). The TCGA offers comprehensive data, covering 23 organs and 117 cancer subtypes, creating fertile ground for testing advanced AI capabilities.

The team used Yottixel, a framework for WSI retrieval, and paired its patch-based embeddings with several different models. The configurations compared were Yottixel-DenseNet, Yottixel-UNI, Yottixel-Virchow, Yottixel-GigaPath, and GigaPath WSI. Retrieval performance was measured with macro-averaged F1 scores for the top 1, top 3, and top 5 retrievals.
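To make the metric concrete, here is a minimal sketch of how a macro-averaged F1 score for top-k retrieval might be computed. It assumes each slide is already reduced to one embedding vector, that retrieval is leave-one-out with cosine similarity, and that the top-k results are turned into a single prediction by majority vote. These are illustrative choices, not necessarily the study's exact protocol, and the function names are hypothetical.

```python
from collections import Counter

import numpy as np


def top_k_majority_labels(embeddings, labels, k):
    """Leave-one-out retrieval: label each slide by majority vote over its
    k most similar slides (cosine similarity), never counting itself."""
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)  # a slide must not retrieve itself
    predictions = []
    for row in sim:
        nearest = np.argsort(-row, kind="stable")[:k]  # most similar first
        votes = Counter(labels[i] for i in nearest)
        # On a tie, Counter keeps first-seen order, so the closer slide wins.
        predictions.append(votes.most_common(1)[0][0])
    return predictions


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (made up): three "kidney" and three "lung" slide embeddings.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2],
                  [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
toy_labels = ["kidney"] * 3 + ["lung"] * 3

for k in (1, 3, 5):
    predicted = top_k_majority_labels(toy_embeddings, toy_labels, k)
    print(k, round(macro_f1(toy_labels, predicted), 3))
```

Because the macro average weights every class equally, a model that does well on common subtypes but poorly on rare ones still scores low, which matters for a dataset with 117 subtypes.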

Retrieval accuracy varied across models. Yottixel-DenseNet had the lowest top-5 score at 27% ± 13%, whereas the foundation-model variants Yottixel-UNI and Yottixel-GigaPath reached 42% ± 14% and 41% ± 13%, respectively. The authors noted, "These results demonstrate the potential and limitations of foundation models for histopathology image retrieval."

Pathologists stand to gain improved diagnostic workflows from digital pathology, but only if the semantic gap (the divide between how humans interpret an image and how machines represent it) can be narrowed. Foundation models are seen as a promising way to bridge that gap.

While many models improved on traditional methods such as DenseNet, overall retrieval accuracy remained below clinical standards. The average F1 score hovered around 44%, raising concerns about how well zero-shot retrieval works on complex datasets. The researchers acknowledged, "Despite the potential of foundation models, their application in zero-shot WSI retrieval within histopathology remains underexplored."

Performance differed sharply between organs. Yottixel-UNI achieved an F1 score of 82% for kidney tissue, which points to the relative homogeneity of certain organ structures. By comparison, retrieval accuracy dropped significantly for complex organs such as the lungs and cervix, illustrating the difficulty of heterogeneous tissue patterns. These findings suggest foundation models excel in specific contexts but struggle to generalize across diverse histological patterns.

Lung tissue was particularly challenging: patch-based embeddings struggled, with top-1 F1 scores as low as 21%. The intricacy of lung tissue structure may exceed what current models can capture, underscoring the need for approaches that encode both local and global features of whole-slide images.

GigaPath WSI's aggregation method, intended to simplify retrieval, did not deliver substantial performance gains. This suggests the aggregation step may discard spatial information needed for accurate diagnostic support.
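As a rough illustration of how collapsing a slide into a single vector can lose detail, the sketch below compares two toy slides in two ways: by the distance between their mean-pooled patch embeddings, and by a set-based median-of-minimum distance over individual patches. Mean pooling is only a simple stand-in for aggregation here, not GigaPath WSI's actual slide encoder. The set-based distance is likewise chosen for illustration, and the function names and data are hypothetical.

```python
import numpy as np


def mean_pooled_distance(patches_a, patches_b):
    """Distance between two slides after collapsing each to one mean vector."""
    return float(np.linalg.norm(patches_a.mean(axis=0) - patches_b.mean(axis=0)))


def median_of_min_distance(patches_a, patches_b):
    """Set-based distance: for each patch in A find its closest patch in B,
    then take the median of those minimum distances."""
    pairwise = np.linalg.norm(patches_a[:, None, :] - patches_b[None, :, :], axis=2)
    return float(np.median(pairwise.min(axis=1)))


# Two toy "slides" (made up): same mean embedding, different patch content.
slide_a = np.array([[1.0, 0.0], [-1.0, 0.0]])
slide_b = np.array([[0.0, 1.0], [0.0, -1.0]])

print(mean_pooled_distance(slide_a, slide_b))    # pooled vectors coincide
print(median_of_min_distance(slide_a, slide_b))  # patch sets still differ
```

In this toy case the pooled vectors are identical even though no patch in one slide resembles any patch in the other. That is one way a slide-level summary can hide differences that a patch-level comparison preserves.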

Future research should focus on developing novel WSI-level embedding techniques, including hybrid models that combine patch-based architectures with large-scale representations. Focused studies on organ-specific datasets will also be important for improving retrieval performance on complex structures.

In conclusion, the research points to the need for more sophisticated methods to improve WSI retrieval accuracy, with zero-shot settings presenting distinct but surmountable challenges. The authors state, "There is a need for the development of new WSI-level embedding techniques..." and call for continued work to bring retrieval accuracy up to clinical standards.