AI and Peer Review: Opportunity and Risk

At the 10th International Congress on Peer Review and Scientific Publication, held in Chicago in September 2025, one theme dominated almost every plenary: artificial intelligence. AI is already woven into the peer review process, reshaping how authors write, how reviewers evaluate, and how journals screen submissions. The Congress provided a glimpse of where AI is helping, where it is undermining trust, and where it may take us next.

Authors and AI: Quiet Uptake, Spotty Disclosure

Several large studies revealed how rapidly authors have embraced generative AI.

  • A JAMA Network analysis of 82,829 manuscripts across 13 journals found that author AI use more than doubled from 2023 to 2025 (1.6% → 4.2%). Disclosures showed most use was for language editing (50%), with smaller but significant proportions for statistical analysis (12%) and content drafting (8%). ChatGPT was the most frequently cited model, accounting for 63% of mentions.
  • A BMJ study of 25,114 submissions found only 7% of authors disclosed AI use, even though independent surveys suggested 50–76% actually did. The most common uses were language improvement (87%) and translation (27%). The gap between disclosure and reality was striking.
  • A survey of Chinese medical researchers (n=159) found 59% used AI, particularly for translation and English polishing. Early-career researchers were the most frequent adopters. Very few admitted this use in formal submissions.

Together these studies reveal a paradox: AI is now a routine part of manuscript preparation, but disclosure rates are far below actual usage. This undercuts transparency and makes it difficult for journals to track how research communication is changing.

Reviewers and AI: A Ban Proves Ineffective

If AI is helping authors, what about reviewers?

A study of 46,500 abstracts and 29,544 reviews in AACR journals showed reviewer use of AI rose steeply after ChatGPT became widely available. When the journals formally prohibited AI, detections in reviewer comments dropped by half — but only briefly. Soon they climbed again, showing that bans did not stop reviewers from experimenting with AI; they simply made it less visible.

JAMA Network data also offered insights: reviewers who disclosed AI use had only slightly slower turnaround times (12.6 vs 11.8 days) and their reports were rated about the same by editors as those from non-AI reviewers. In other words, reviewers are using AI, editors often cannot tell the difference, and bans have limited effect.

The main issue is disclosure. Should reviewers who use AI to clean up language, summarize methods, or check compliance with CONSORT have to declare it? Many at the congress argued yes, but disclosure policies remain inconsistent across journals.

AI as a Peer Reviewer: The NEJM “AI Fast Track”

Perhaps the boldest experiment came from The New England Journal of Medicine (NEJM), which piloted an “AI Fast Track.”

Using GPT-5 and Gemini Pro, NEJM tested whether large language models could act as first-pass reviewers for clinical trial submissions. The AI flagged methodological flaws and statistical inconsistencies that some human reviewers missed. For example, it identified implausible sample size justifications and incomplete descriptions of randomization methods.

Still, NEJM stressed that human oversight was essential. The AI could detect anomalies but lacked contextual judgment, for example the ability to distinguish a critical methodological flaw from a simple reporting omission. The conclusion: AI can complement human review, but it cannot yet replace it.
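
NEJM did not present implementation details, so the following is only a minimal sketch of the first-pass pattern described above, assuming the OpenAI Python SDK; the model identifier, prompt wording, and screening categories are illustrative assumptions, not NEJM's actual pipeline.

```python
# Minimal sketch of an LLM "first-pass reviewer" call. Assumes the OpenAI
# Python SDK; the model identifier, prompt, and screening categories are
# illustrative assumptions, not NEJM's AI Fast Track implementation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SCREENING_PROMPT = """You are a first-pass screener for clinical trial manuscripts.
List anything you can identify in these categories, quoting the relevant passage:
- implausible or unjustified sample size calculations
- incomplete descriptions of randomization or allocation concealment
- statistical inconsistencies (e.g., percentages that do not match counts)
Do NOT rate severity: a human editor decides whether each flag is a
critical methodological flaw or a simple reporting omission."""

def first_pass_screen(manuscript_text: str) -> str:
    """Ask the model for a list of potential anomalies in one manuscript."""
    response = client.chat.completions.create(
        model="gpt-5",  # model cited at the Congress; identifier assumed here
        messages=[
            {"role": "system", "content": SCREENING_PROMPT},
            {"role": "user", "content": manuscript_text},
        ],
    )
    return response.choices[0].message.content
```

The design choice worth noting is the last line of the prompt: the distinction the models could not make reliably, flaw versus omission, is explicitly handed back to the human editor.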

AI in Integrity Screening

Beyond authors and reviewers, journals are deploying AI in submission triage and integrity checks. These uses are less controversial than reviewers adopting ChatGPT on their own: they are transparent, controlled, and overseen by editors. They also address urgent problems like paper mills and research misconduct.

Risks: Misuse, Manipulation, and Missing Accountability

While the opportunities are obvious, the risks of AI in peer review came up repeatedly:

  • Misuse and under-reporting. Authors and reviewers are already using AI far more than they admit, making disclosure unreliable.
  • Manipulation by paper mills. Fraudulent operators adapt quickly to detection tools, learning how to disguise AI-generated text or images.
  • Loss of accountability. Instances were reported where ChatGPT was listed as a co-author, a category error that underlines how AI use can erode responsibility for research claims.
  • Over-trust in machines. AI may appear authoritative but can be wrong in subtle or systematic ways. Without human oversight, flawed recommendations could slip through.

The Drummond Rennie Lecture by Ana Marušić stressed that authorship and peer review must remain human responsibilities. Contributor taxonomies like CRediT and persistent identifiers for people (ORCID) and groups (ROR) are essential to preserve accountability in an AI-infused environment.
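
Part of that argument is infrastructural: accountability can only be audited if contributions are recorded in structured, identifier-backed form. Here is a minimal sketch of such a record; the person, identifier values, and the ai_assistance field are hypothetical, while the CRediT role names are real taxonomy entries.

```python
# Sketch of a structured contributor record. The person and identifier values
# are hypothetical (shown in valid ORCID/ROR formats); the CRediT role names
# are real taxonomy entries. The ai_assistance block is not part of CRediT;
# it illustrates where an AI-use disclosure could attach.
contributor = {
    "name": "Jane Doe",
    "orcid": "https://orcid.org/0000-0002-1825-0097",  # example-format ORCID iD
    "affiliation_ror": "https://ror.org/00x0z1472",    # hypothetical ROR ID
    "credit_roles": [
        "Conceptualization",
        "Formal analysis",
        "Writing - original draft",
    ],
    "ai_assistance": {
        "used": True,
        "tool": "ChatGPT",
        "purpose": "language editing",
    },
}
```

The point is the lecture's: the AI appears only as a tool attached to an accountable human contributor, never as a contributor in its own right.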

The Opportunity: A Hybrid Future

Despite concerns, optimism was evident. Most presenters agreed the future is hybrid:

  • AI will act as a scalable assistant, highlighting missing checklist items, spotting statistical anomalies, and scanning for fraud at speeds humans cannot match (a toy sketch of the checklist idea follows this list).
  • Humans will provide judgment, context, and accountability, making decisions that machines cannot.
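
The first of those assistant roles is easy to make concrete. Below is a deliberately toy, rule-based sketch of a missing-checklist-item scan; production screening tools are far more sophisticated, and the item list and regular expressions here are simplified assumptions.

```python
# Toy illustration of "highlighting missing checklist items": a rule-based
# scan for a few CONSORT-style reporting items. The checklist entries and
# patterns are simplified assumptions, not a real screening tool.
import re

CHECKLIST = {
    "randomization method": r"randomi[sz](?:ed|ation)",
    "allocation concealment": r"allocation|conceal",
    "sample size justification": r"sample size|power (?:calculation|analysis)",
    "trial registration": r"NCT\d{8}|ISRCTN\d+|trial registr",
}

def flag_missing_items(manuscript_text: str) -> list[str]:
    """Return the checklist items with no matching language in the text."""
    return [item for item, pattern in CHECKLIST.items()
            if not re.search(pattern, manuscript_text, re.IGNORECASE)]

if __name__ == "__main__":
    draft = "Participants were randomized 1:1 to intervention or control."
    print(flag_missing_items(draft))
    # -> ['allocation concealment', 'sample size justification', 'trial registration']
```

However it is implemented, the division of labor is the one presenters described: the machine enumerates what seems to be missing, and a human decides whether it matters.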

This hybrid model could also support equity. Non-native English speakers and early-career reviewers are more likely to use AI. If provided transparently and within editorial platforms, AI could reduce language barriers and support less experienced reviewers, making the process more inclusive.

Looking Ahead

The Peer Review Congress made it clear: AI is not a distant threat or promise. It is already here, in author manuscripts, in reviewer reports, and in editorial workflows. The risks are real: fraud, opacity, misplaced trust. But so are the opportunities: faster integrity checks, more reproducible science, and support for a more diverse reviewer pool.

As one delegate put it: “AI is like the calculator for peer review. At first controversial, then indispensable, but only if we use it responsibly.”

Closing

This post is the first deep-dive in a three-part series reflecting on the Peer Review Congress, which I introduced previously. Here, I highlighted AI and peer review: the opportunities and risks. Next, I’ll examine how journals and publishers are fighting back against paper mills, fake authors, and research misconduct at scale. Then I’ll turn to the future of peer review — incentives, preprints, and the human factor.

The debates in Chicago showed that AI will not destroy peer review, but it will change it. The challenge for the scholarly community is not whether to use AI, but how to do so with transparency, accountability, and integrity.

– By Tony Alves
