Study Background and Disease Burden
Endometrial cancer represents one of the most common gynecologic malignancies worldwide, with increasing incidence attributed to aging populations and rising obesity rates. Effective patient communication regarding risk factors, preventive strategies, diagnostic procedures, and treatment options is essential to optimize patient outcomes and quality of life. However, time constraints in clinical encounters and variability in communication skills may limit thorough patient education. This unmet need presents an opportunity to explore artificial intelligence (AI) applications like ChatGPT-4o to supplement clinical interactions and improve patient understanding and support. By examining AI performance against specialized gynecologic oncologists in addressing patient inquiries about endometrial cancer, this study provides timely insights into AI’s role within the evolving oncology care landscape.
Study Design
This prospective comparative study used a validated set of 100 patient-oriented questions about endometrial cancer, equally divided into two domains: primary care (focused on risk factors and prevention) and secondary care (focused on diagnosis and treatment). The questions were selected and reviewed by expert specialists to ensure clinical relevance and to reflect common patient concerns.
Each question was answered independently by ChatGPT-4o and a board-certified gynecologic oncologist. Two independent oncologists then evaluated the responses for accuracy, empathy, and completeness on a standardized 4-point Likert scale (higher scores indicate better performance). Additional metrics included word count and readability scores, capturing answer length and comprehensibility. Statistical comparisons were conducted to determine the significance of observed differences.
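To make the evaluation pipeline concrete, the sketch below illustrates how paired Likert ratings, word counts, and readability might be compared. It is a minimal illustration under stated assumptions, not the authors' analysis code: the Wilcoxon signed-rank test, the textstat library, and all sample data are hypothetical choices, since this summary does not specify the statistical tests or software used.

    import textstat                    # third-party readability library (assumed)
    from scipy.stats import wilcoxon   # paired nonparametric test (assumed choice)

    # Hypothetical paired accuracy ratings on the study's 4-point Likert scale,
    # one pair per question (AI answer vs. oncologist answer).
    ai_scores     = [4, 4, 3, 4, 4, 3, 4, 2, 4, 3, 4, 4]
    doctor_scores = [3, 2, 2, 3, 3, 2, 3, 1, 2, 2, 3, 3]

    # Paired comparison of the two answer sets; the summary does not state
    # which statistical test the study actually used.
    stat, p_value = wilcoxon(ai_scores, doctor_scores)
    print(f"Wilcoxon statistic = {stat:.1f}, p = {p_value:.4f}")

    # Word count and readability for a single hypothetical answer.
    answer = ("Endometrial cancer risk rises with obesity, diabetes, and "
              "prolonged unopposed estrogen exposure; weight management and "
              "prompt evaluation of abnormal bleeding are key preventive steps.")
    print("Word count:", len(answer.split()))
    print("Flesch-Kincaid grade:", textstat.flesch_kincaid_grade(answer))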
Key Findings
ChatGPT-4o significantly outperformed the gynecologic oncologist across all evaluated domains. Specifically, its accuracy score averaged 3.86 compared to 3.36 for the oncologist (p < 0.001), indicating superior correctness and factual reliability.
In terms of empathy, an area traditionally challenging for AI, ChatGPT-4o scored 3.47, markedly higher than the physician’s 1.66 (p < 0.001). This suggests that the AI-generated responses were perceived by raters as more sensitive and patient-centered, with potential to enhance emotional support.
When assessed for completeness, ChatGPT-4o’s answers were more comprehensive (3.00 vs. 1.97; p < 0.001). The AI provided thorough explanations encompassing multiple aspects of each question, whereas the physician answers tended to be concise but less detailed.
Notably, ChatGPT-4o responses were substantially longer (mean 403.51 words) than the oncologist’s (mean 25.06 words), a roughly sixteen-fold difference that, while contributing to completeness, might overwhelm some patients. Readability analyses revealed that both AI and physician texts required a similarly high literacy level, indicating ongoing challenges in delivering accessible information.
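For reference, one readability index commonly applied to patient-education materials (though not necessarily the one used in this study) is the Flesch-Kincaid Grade Level:

    FKGL = 0.39 × (total words / total sentences) + 11.8 × (total syllables / total words) - 15.59

A score near 12 corresponds to a U.S. twelfth-grade reading level, well above the sixth-to-eighth-grade level generally recommended for patient materials.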
Subanalyses comparing primary versus secondary care questions showed consistent superiority of ChatGPT-4o, suggesting its efficacy across the care continuum.
Expert Commentary
These findings challenge preconceived notions about AI limitations in clinical communication, particularly regarding empathy. The ability of ChatGPT-4o to simulate compassionate language and deliver detailed, accurate medical information carries meaningful potential for oncology practice.
However, the substantially longer answers produced by the AI may be perceived as overly complex or verbose, possibly hindering patient comprehension. This underscores the need to tune AI outputs to balance detail with clarity and to account for patients' health literacy.
Furthermore, integration of AI tools alongside human clinicians, especially oncology nurses who frequently provide patient education and psychosocial support, could enhance care quality without replacing critical human judgment. Experts highlight that AI’s role should be as an adjunct to enrich communication, not a substitute for clinician-patient relationships.
Limitations include the study’s single-oncologist comparator and evaluation of static written responses rather than interactive dialogue. Future multi-center, patient-involved studies assessing real-world utility and acceptance are warranted.
Conclusion
This prospective comparative study demonstrates that ChatGPT-4o surpassed a board-certified gynecologic oncologist in rated accuracy, empathy, and completeness when addressing patient questions on endometrial cancer. While its verbose responses pose challenges, the findings suggest a promising role for AI in complementing oncology nursing and patient education and in strengthening supportive care.
To maximize clinical applicability, future AI enhancements must prioritize balancing informative depth with readability and tailoring communication to individual patient needs. Continued research should focus on integrating AI-driven tools within multidisciplinary care frameworks to optimize patient engagement and outcomes in endometrial cancer management.
References
İnan SA, İnan M, Türkmen O. ChatGPT-4o vs. oncologists in addressing endometrial cancer patient inquiries: A prospective comparative study in primary and secondary care. Eur J Oncol Nurs. 2025 Aug;77:102930. doi: 10.1016/j.ejon.2025.102930. Epub 2025 Jul 17. PMID: 40706414.