Precision Oncology Meets Artificial Intelligence: Navigating Ancestry-Associated Variability in Digital Pathology for EGFR Prediction

Highlights

  • Open-source AI pathology models show promising AUCs (up to 0.83) for predicting EGFR mutations directly from H&E slides, potentially accelerating clinical decision-making.
  • Significant performance disparities exist across ancestral groups, with models performing notably worse in patients of Asian ancestry (AUC 0.68) compared to those of European (AUC 0.84) or African (AUC 0.85) ancestry.
  • Tissue context remains a critical variable: AUC fell from 0.86 on lung tissue specimens to 0.66 on pleural specimens.
  • AI-guided triage could potentially reduce the need for rapid molecular testing by up to 57%, optimizing resource allocation in precision oncology.

Background

The management of lung adenocarcinoma (LUAD) has been revolutionized by the identification of targetable oncogenic drivers, most notably mutations in the epidermal growth factor receptor (EGFR). Identification of these mutations is essential for initiating tyrosine kinase inhibitor (TKI) therapy, which significantly improves survival outcomes compared to traditional chemotherapy. However, conventional molecular testing methods, such as next-generation sequencing (NGS) and polymerase chain reaction (PCR), often involve significant turnaround times (1–3 weeks) and require substantial tissue quantity, which can delay the initiation of life-saving therapy.

Artificial intelligence (AI), specifically deep learning models trained on whole-slide images (WSIs) of hematoxylin-eosin (H&E)–stained slides, has emerged as a potential way to shorten this delay. These models aim to identify morphological patterns, often imperceptible to the human eye, that correlate with specific genomic alterations. While early proof-of-concept studies demonstrated the feasibility of “image-to-mutation” prediction, a critical unmet need remains: ensuring that these models are robust, generalizable, and equitable across diverse global populations and clinical specimen types.

Key Content

Chronological Development of AI in Lung Cancer Pathology

The journey of AI-based mutation prediction began with foundational studies (e.g., Coudray et al., 2018) demonstrating that convolutional neural networks (CNNs) could distinguish between LUAD and squamous cell carcinoma and predict common mutations like EGFR and KRAS with moderate accuracy. Following this, multiple “black-box” and interpretable models were developed. Recently, the focus has shifted from internal validation within single institutions to large-scale, multi-institutional external validation. The study by Rakaee et al. (2026) represents a pivotal milestone in this progression, moving beyond simple accuracy metrics to investigate the socio-biological determinants of model performance, specifically genetic ancestry.

Evidence by Model Architecture and Performance

The current evidence base involves two primary open-source AI pathology models. In the Dana-Farber Cancer Institute (DFCI) cohort (n = 1759), one model predicted EGFR status with an AUC of 0.83 (95% CI, 0.81-0.85), while the second reached an AUC of only 0.68. This gap may reflect differences in model architecture and in the diversity of the training data, both of which affect robustness. In the European TNM-I validation cohort (n = 339), the models achieved AUCs of 0.81 and 0.75, respectively, suggesting a degree of geographic generalizability across Western populations.

Performance Disparities by Genetic Ancestry

A key finding is the variability in performance when patients are stratified by genetic ancestry. Using germline genotype data to infer ancestry, the researchers found that the higher-performing model maintained strong discrimination in the European (AUC 0.84) and African (AUC 0.85) subgroups. However, a stark decline was observed in the Asian ancestry subgroup (AUC 0.68). This is particularly concerning given that EGFR mutations are most prevalent in Asian populations (up to 50% of LUAD cases). The divergence suggests either that the morphological manifestations of EGFR mutations differ across ancestral backgrounds or that the training sets, predominantly composed of European-derived data, fail to capture the subtle features present in Asian patients.
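A stratified evaluation of this kind can be sketched in a few lines of Python. The example below is a minimal illustration under stated assumptions, not the study's code: the records are synthetic, the group names, prevalences, and score distributions are invented for the sketch, and `auc`, `bootstrap_ci`, `auc_by_group`, and `simulate` are hypothetical helper names. It reports a per-group AUC with a percentile bootstrap interval, since smaller subgroups tend to give wider intervals and a subgroup gap is easier to interpret alongside its uncertainty.

```python
import random
from collections import defaultdict

def auc(labels, scores):
    """Probability that a random mutant case outscores a random wild-type case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost one of the classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

def auc_by_group(records):
    """records: iterable of (group, label, score) -> {group: (n, auc, (lo, hi))}."""
    grouped = defaultdict(lambda: ([], []))
    for group, label, score in records:
        grouped[group][0].append(label)
        grouped[group][1].append(score)
    return {g: (len(ys), auc(ys, ss), bootstrap_ci(ys, ss))
            for g, (ys, ss) in grouped.items()}

def simulate(group, n, prevalence, separation, rng):
    """Synthetic cases only: label 1 shifts the mean score by `separation`."""
    out = []
    for _ in range(n):
        y = 1 if rng.random() < prevalence else 0
        out.append((group, y, rng.gauss(separation * y, 1.0)))
    return out

# Made-up settings: group_B is smaller, has higher mutation prevalence,
# and is given a weaker score signal than group_A.
rng = random.Random(42)
records = (simulate("group_A", 200, 0.20, 1.4, rng)
           + simulate("group_B", 80, 0.45, 0.6, rng))

results = auc_by_group(records)
for group, (n, point, (lo, hi)) in sorted(results.items()):
    print(f"{group}: n={n}, AUC={point:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

With real data, the `records` list would be replaced by one row per patient carrying the inferred ancestry group, the sequencing-confirmed EGFR status, and the model's output score.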

Methodological Challenges: Specimen Type and Triage Utility

The clinical utility of AI models also depends on the specimen source. Analysis by sample type showed that model performance was highest on lung tissue specimens (AUC 0.86) and dropped markedly on pleural specimens (AUC 0.66). This likely reflects the different stromal environments and cellular compositions of metastatic sites versus primary tumors, which may obscure the morphological cues the AI relies upon.

Despite these limitations, AI models show potential for clinical triage. With a high-confidence threshold applied to AI predictions, clinicians could potentially bypass rapid EGFR testing in 57% of patients while maintaining a specificity of 0.99. This “triage-positive” approach prioritizes only the most likely candidates for rapid molecular testing, saving cost and time that can be redirected to more complex cases.
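To make the idea of a high-confidence threshold concrete, here is a minimal sketch in Python. It assumes a simple one-cutoff rule (a case is flagged when its score is at or above the cutoff), which is an illustrative simplification and not necessarily the triage rule used in the study. The data are synthetic, and `threshold_for_specificity` and `triage_summary` are hypothetical helper names. The cutoff is chosen on one half of the data and then checked on the other half, because a cutoff tuned and evaluated on the same cases tends to look better than it is.

```python
import random
from bisect import bisect_left

def threshold_for_specificity(labels, scores, target_spec=0.99):
    """Smallest observed cutoff whose specificity is at least target_spec.

    A case is flagged when score >= cutoff; specificity is the share of
    wild-type (label 0) cases that are not flagged.
    """
    neg = sorted(s for y, s in zip(labels, scores) if y == 0)
    if not neg:
        raise ValueError("need at least one wild-type case to estimate specificity")
    for cutoff in sorted(set(scores)):
        # bisect_left counts the wild-type scores strictly below the cutoff
        if bisect_left(neg, cutoff) / len(neg) >= target_spec:
            return cutoff
    return float("inf")  # no observed cutoff is strict enough, so flag nothing

def triage_summary(labels, scores, cutoff):
    """Specificity, sensitivity, and share of cases flagged at a given cutoff."""
    flagged = [s >= cutoff for s in scores]
    tn = sum(1 for y, f in zip(labels, flagged) if y == 0 and not f)
    tp = sum(1 for y, f in zip(labels, flagged) if y == 1 and f)
    return {
        "specificity": tn / labels.count(0),
        "sensitivity": tp / labels.count(1),
        "fraction_flagged": sum(flagged) / len(labels),
    }

# Synthetic cohort: prevalence and score distributions are made-up values.
rng = random.Random(7)
labels = [1 if rng.random() < 0.25 else 0 for _ in range(1000)]
scores = [rng.gauss(1.5 * y, 1.0) for y in labels]

# Choose the cutoff on a tuning half, then evaluate it on the held-out half.
tune_labels, tune_scores = labels[:500], scores[:500]
test_labels, test_scores = labels[500:], scores[500:]

cutoff = threshold_for_specificity(tune_labels, tune_scores, target_spec=0.99)
summary = triage_summary(test_labels, test_scores, cutoff)
print(f"cutoff={cutoff:.3f}")
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The held-out specificity will generally differ a little from the 0.99 target, which is why a threshold of this kind would need external validation, ideally within each ancestry group and specimen type, before it guides testing decisions.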

Expert Commentary

The findings by Rakaee et al. underscore both the promise and the peril of integrating AI into oncology. From a clinical perspective, the ability to predict EGFR status from a routine H&E slide within minutes is an extraordinary advancement. However, the ancestry-associated variability is a critical “red flag.” If an AI model is less accurate for Asian patients—the very group most likely to benefit from EGFR-targeted therapies—its deployment could inadvertently exacerbate existing healthcare disparities.

Mechanistically, the lower performance in the Asian ancestry subgroup and in pleural samples suggests that AI models may be learning features associated with the tumor microenvironment or specific histological subtypes (e.g., lepidic vs. solid growth patterns) that correlate differently with EGFR mutations across populations. Future model development should therefore prioritize “ancestry-aware” training on large, diverse datasets from global biobanks to ensure equitable performance. Furthermore, the decline in pleural samples suggests that models need to be tuned specifically for metastatic site morphologies rather than relying on a “one-size-fits-all” lung cancer algorithm.

Conclusion

AI-based pathology tools represent a transformative adjunct for EGFR prediction in lung cancer, offering a pathway to rapid triage and reduced molecular testing burdens. However, current models exhibit significant performance gaps related to genetic ancestry and specimen origin. Future research must focus on diversifying training datasets to include broader ancestral representation and optimizing models for diverse tissue contexts. Until these gaps are bridged, AI should be viewed as a preliminary screening tool—a “digital triage”—that complements, rather than replaces, gold-standard molecular testing. The pursuit of precision oncology must ensure that the “precision” of AI tools is equally distributed across all patient populations.

References

  • Rakaee M, Nassar AH, Tafavvoghi M, et al. Ancestry-Associated Performance Variability of Open-Source AI Models for EGFR Prediction in Lung Cancer. JAMA Oncol. 2026;12:e256430. doi:10.1001/jamaoncol.2025.6430. PMID: 41678173.
  • Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567. PMID: 30224741.
  • Echle A, Rindtorff N, Brinker TJ, et al. Deep learning in cancer pathology: a new frontier for precision oncology. Cancer Cell. 2021;39(2):164-167. PMID: 33592176.
