Introduction
Peer review is the cornerstone of evidence-based medicine, ensuring the credibility and quality of published scientific research. Yet, this process faces mounting challenges: an increasing volume of manuscripts, reviewer fatigue, and concerns regarding efficiency, bias, and reliability. With the peer review system stretched thin, innovative approaches are crucial to sustain its integrity and responsiveness.
Artificial intelligence (AI), particularly large language models (LLMs), has emerged as a promising tool to assist and potentially transform peer review. This article critically examines the opportunities, challenges, and strategies for integrating AI into peer review, focusing on balancing technological advances with human accountability.
Background and Challenges in Peer Review
The expansion of scientific publishing has increased demands on peer reviewers, many of whom report fatigue and disengagement. Traditional peer review suffers from inefficiencies that delay dissemination of critical findings and may introduce inconsistencies or biases. Additionally, peer review sometimes fails to detect methodological errors, poor-quality studies, or fraudulent data.
Efforts to address these issues have included enhanced reviewer training, mentoring programs, and the use of software for matching reviewers with appropriate manuscripts. However, these measures alone have not sufficiently expanded the reviewer pool or enhanced the speed and quality of reviews.
Potential of Artificial Intelligence in Peer Review
AI systems, especially LLMs, can rapidly process and summarize complex manuscripts, extract key features, and support interactive reviewer engagement through question-and-answer interfaces. They can also automate routine editorial tasks such as checking adherence to submission guidelines, detecting missing reporting elements, verifying data consistency across manuscript sections, and generating summaries.
Such automation could reduce reviewers’ workload, mitigate fatigue, and accelerate editorial decision-making. For instance, AI could function much like established plagiarism-detection tools, providing an auxiliary layer of quality assurance without replacing human judgment.
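The routine pre-review checks described above can be sketched as a simple rule-based screen that flags gaps for an editor to triage. This is a minimal illustration, not an established tool: the required sections, checklist keywords, and function names are all hypothetical assumptions.

```python
# Illustrative sketch of an automated pre-review screen that flags
# missing sections and reporting elements before human review.
# Section names and checklist items below are hypothetical examples.

REQUIRED_SECTIONS = ["abstract", "methods", "results", "discussion"]

# Each reporting keyword is expected to appear in a given section.
REPORTING_KEYWORDS = {
    "sample size": "methods",
    "ethics approval": "methods",
    "limitations": "discussion",
}

def screen_manuscript(sections: dict) -> list:
    """Return human-readable flags; an editor verifies each one."""
    flags = []
    for name in REQUIRED_SECTIONS:
        if not sections.get(name, "").strip():
            flags.append(f"missing section: {name}")
    for keyword, where in REPORTING_KEYWORDS.items():
        if keyword not in sections.get(where, "").lower():
            flags.append(f"'{keyword}' not mentioned in {where}")
    return flags

manuscript = {
    "abstract": "We studied...",
    "methods": "Sample size was calculated a priori; ethics approval was obtained.",
    "results": "...",
    "discussion": "Findings were robust across sensitivity analyses.",
}
print(screen_manuscript(manuscript))
# → ["'limitations' not mentioned in discussion"]
```

The output is advisory only: each flag is a prompt for human verification, mirroring how plagiarism-detection reports are used today.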
Limitations and Risks of AI Integration
Despite these advantages, current AI models have notable limitations. They can generate false positives (flagging non-issues) and false negatives (missing substantive errors). Importantly, AI cannot yet replicate human expertise in evaluating novelty, clinical relevance, or methodological rigor, which require contextual judgment and ethical reasoning.
Confidentiality is another concern: uploading manuscripts to public AI platforms can expose unpublished data and intellectual property. Moreover, unequal access to AI tools may exacerbate disparities among reviewers and institutions.
AI may produce plausible but incorrect outputs due to confabulation (hallucination), requiring human reviewers to meticulously verify AI-derived insights, which could paradoxically increase workload.
Bias in AI-generated content is a critical concern. Models may unintentionally favor certain topics, methodologies, or linguistic styles, and while they do not harbor personal biases, their training data may embed systemic prejudices. Furthermore, reliance on AI could lead to cognitive offloading, diminishing critical thinking and fostering a homogenization of scientific discourse.
Current Guidelines and Ethical Considerations
Leading publishers and editorial bodies, including the JAMA Network and the International Committee of Medical Journal Editors (ICMJE), have developed policies to guide AI use. Key principles include:
– Prohibition of AI tools as manuscript authors due to the inability of AI to assume accountability.
– Mandatory disclosure by authors and reviewers when AI tools contribute to writing or reviewing.
– Prohibition of uploading confidential manuscripts to unsecured AI platforms.
– Maintenance of ultimate editorial and reviewer responsibility despite AI assistance.
These measures preserve ethical standards and help maintain trust in the peer review process.
Strategies for Implementation: Hybrid Human-AI Models
Recognizing both the potential and pitfalls of AI, journals like those in the JAMA Network advocate hybrid models where AI tools support but do not replace human reviewers and editors.
Such models could include:
– AI-generated parallel reviews focusing on specific aspects like methodological fidelity or compliance with reporting standards.
– AI-assisted meta-reviews synthesizing multiple human reviews into structured recommendations.
– AI copilot systems that help human reviewers with summarization and error detection while leaving ultimate judgment to the reviewers.
This approach parallels driver-assistance technologies that augment but do not replace human control, preserving human oversight and accountability.
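One way such a hybrid workflow could be structured is to keep the editorial recommendation derived solely from human reviews, with AI output attached only as advisory flags for the editor to verify. The data structures, scoring scale, and thresholds below are illustrative assumptions, not a description of any journal’s actual system.

```python
from dataclasses import dataclass, field

# Illustrative sketch of a hybrid human-AI review packet: the
# recommendation comes from human scores only; AI flags are advisory
# and must be verified by a human. All names here are hypothetical.

@dataclass
class ReviewPacket:
    human_scores: list                      # e.g., 1 (reject) .. 5 (accept)
    ai_flags: list = field(default_factory=list)

def editor_summary(packet: ReviewPacket) -> dict:
    """Synthesize human reviews; append AI flags without letting them
    alter the recommendation."""
    avg = sum(packet.human_scores) / len(packet.human_scores)
    if avg >= 4:
        recommendation = "accept"
    elif avg >= 3:
        recommendation = "revise"
    else:
        recommendation = "reject"
    return {
        "recommendation": recommendation,        # humans decide
        "ai_flags_to_verify": packet.ai_flags,   # advisory only
    }

packet = ReviewPacket(
    human_scores=[4, 5, 4],
    ai_flags=["Table 2 totals inconsistent with text"],
)
print(editor_summary(packet))
# → {'recommendation': 'accept',
#    'ai_flags_to_verify': ['Table 2 totals inconsistent with text']}
```

The design choice mirrors the driver-assistance analogy: the AI channel can surface issues but has no pathway to override the human decision.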
Addressing Challenges Through Ongoing Research and Policy
Continuous empirical study is essential to determine which AI applications improve peer review quality without compromising fairness or security. Conferences such as the International Congress on Peer Review and Scientific Publication facilitate dissemination of such research.
Journals are exploring:
– Empirical assessments of AI impact on review timeliness and quality.
– Methods to mitigate AI biases and ensure equitable access.
– Protocols for safeguarding confidentiality.
– Strategies to avoid reward hacking, in which authors optimize manuscripts to satisfy AI algorithms rather than to communicate science clearly.
Effective policies and quality improvement cycles will guide responsible AI adoption in editorial workflows.
Conclusion
Artificial intelligence holds significant promise to augment the peer review process by automating routine tasks and supporting reviewers, thereby addressing challenges such as reviewer fatigue and inefficiencies. However, AI’s current limitations—in context interpretation, ethical judgment, and error-free analysis—necessitate cautious, hybrid implementations that maintain human oversight.
Ethical guidelines, empirical evaluation, and equitable access are essential to harness AI’s benefits while mitigating risks like confidentiality breaches, bias, and reduced critical engagement. Ultimately, AI should be conceptualized as a copilot to human expertise rather than a replacement, preserving the scientific rigor, fairness, and accountability fundamental to credible peer review.
As the field advances, ongoing research and thoughtful policy will ensure that AI enriches the peer review process, accelerating the dissemination of high-quality medical science to benefit clinicians, researchers, and patients alike.
References
1. Perlis RH, Christakis DA, Bressler NM, et al. Artificial Intelligence in Peer Review. JAMA. Published online August 28, 2025. doi:10.1001/jama.2025.15827
2. International Committee of Medical Journal Editors (ICMJE). Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals. Available at: http://www.icmje.org/icmje-recommendations.pdf
3. Tennant JP, Ross-Hellauer T. The limitations to our understanding of peer review. Research Integrity and Peer Review. 2020;5:6. doi:10.1186/s41073-020-00092-1
4. Lee CJ, Sugimoto CR, Zhang G, Cronin B. Bias in peer review. J Am Soc Inf Sci Technol. 2013;64(1):2-17. doi:10.1002/asi.22784
5. Erren TC, Erren M, Buddeberg-Fischer B. Ethical standards in scientific publishing: The issue of ghost authorship. Dtsch Arztebl Int. 2009;106(31-32):548-553. doi:10.3238/arztebl.2009.0548
6. Resnik DB, Elmore SA. Ensuring the integrity and quality of peer review in biomedical journals. Am J Bioeth. 2016;16(9):34-36. doi:10.1080/15265161.2016.1203263