
AI Peer Review Called “Inevitable” by Some, “Disaster” by Others

The whole debate raises a question: How much original thought goes into peer review anyway? And what purpose does it ultimately serve?

At Nature, science writer Miriam Naddaf reports that artificial intelligence, in the form of chatbots and other LLMs, is increasingly being used to review papers. Some scientists worry about the direction:

AI systems are already transforming peer review — sometimes with publishers’ encouragement, and at other times in violation of their rules. Publishers and researchers alike are testing out AI products to flag errors in the text, data, code and references of manuscripts, to guide reviewers toward more-constructive feedback, and to polish their prose. Some new websites even offer entire AI-created reviews with one click.

But with these innovations come concerns. Although today’s AI products are cast in the role of assistants, AI might eventually come to dominate the peer-review process, with the human reviewer’s role reduced or cut out altogether.

“AI is transforming peer review — and many scientists are worried,” March 26, 2025

Relentless advance?

One study found that between 7% and 17% of peer-review reports submitted to AI conferences in 2023 and 2024 showed signs of being “substantially modified” by LLMs.

Some see AI peer review as inevitable, Naddaf notes, and some see it as a “disaster.”

Given chatbots’ well-deserved reputation for hallucination and model collapse, here’s one risk: in a world where few are paying close attention anyway, an AI review may be badly off base, yet not in a way that attracts immediate notice. Thus the damage to the research record grows undetected.

The debate on the pros and cons seems tinged with an air of resignation:

Using offline LLMs to rephrase one’s notes can speed up and sharpen the process of writing reviews, so long as the LLMs don’t “crank out a full review on your behalf”, wrote Dritjon Gruda, an organizational-behaviour researcher at the Catholic University of Portugal in Lisbon, in a Nature careers column.

But “taking superficial notes and having an LLM synthesize them falls far, far short of writing an adequate peer review”, counters Carl Bergstrom, an evolutionary biologist at the University of Washington in Seattle. If reviewers start relying on AI so that they can skip most of the process of writing reviews, they risk providing shallow analysis. “Writing is thinking,” Bergstrom says.

“AI is transforming peer review — and many scientists are worried,” March 26, 2025

Yes, writing is thinking. But for that very reason, the whole debate raises a question: How much original thought goes into peer review anyway? And what purpose does it ultimately serve?

A layperson might fairly ask the scientist: If you’re not going to read the article, why generate copy about it? And if you are going to read it, why do you need a chatbot to write the review?

Peer review got started with the explosion of science in the 20th century. It’s helpful to keep in mind that when Albert Einstein (1879–1955) wrote his seminal physics papers in 1905, his only reviewers were his editors at Annalen der Physik, some of whom were Nobel Prize winners themselves. The AI trend might spark a healthy discussion of what, exactly, peer review is doing these days. Is there a better way of doing it? Some are experimenting, for example, with paid and open peer review to ease the time burden and the sensitivities involved.

A newer approach that Naddaf notes is to use AI not to generate a review but to provide feedback on material already written. Several AI programs she discusses can identify unclear sentences, incorrect facts, or questionable citations. The scientist thus remains engaged with the work while saving time on grammar-checking, fact-checking, and the like. With luck, that’s the best of both worlds.

In other news, AI is not poised to take over science anyway

Meanwhile, in a development certainly not predicted by pop science media, science writer Ben Turner tells us at LiveScience that “76% of scientists said that scaling large language models was ‘unlikely’ or ‘very unlikely’” to bring about machines that think like humans. That is pretty much the opposite of what we’ve been hearing for years from Sam Altman and OpenAI:

Now, as recent model releases appear to stagnate, most of the researchers polled by the Association for the Advancement of Artificial Intelligence believe tech companies have arrived at a dead end — and money won’t get them out of it.

“I think it’s been apparent since soon after the release of GPT-4, the gains from scaling have been incremental and expensive,” Stuart Russell, a computer scientist at the University of California, Berkeley who helped organize the report, told Live Science. “[AI companies] have invested too much already and cannot afford to admit they made a mistake [and] be out of the market for several years when they have to repay the investors who have put in hundreds of billions of dollars. So all they can do is double down.”

“Current AI models a ‘dead end’ for human-level intelligence, scientists agree,” March 27, 2025

That’s pretty blunt, almost cynical. But the 475 AI researchers surveyed are confronting a reality: fundamental limitations in the architecture of the chatbots used to generate, say, plausible-sounding peer reviews or essays.

Projections also show the finite human-generated data essential for further growth will most likely be exhausted by the end of this decade. Once this has happened, the alternatives will be to begin harvesting private data from users or to feed AI-generated “synthetic” data back into models that could put them at risk of collapsing from errors created after they swallow their own input.

“Current AI models a ‘dead end’ for human-level intelligence, scientists agree,” March 27, 2025

That risk is called model collapse. Another problem, about which Gary Smith has written extensively here at Mind Matters News, is hallucination — the “bears in space” problem, where the system, stuck for an answer, just makes something up.
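Model collapse is easy to see in miniature. The following is a minimal, purely illustrative sketch (a toy setup assumed for illustration, not drawn from either article): a simple Gaussian model is refit, generation after generation, only on samples drawn from its previous self. Because each round trains on the prior model's output rather than on fresh human data, estimation error compounds and the fitted distribution drifts away from the original.

```python
# Toy sketch of model collapse (illustrative only): refit a Gaussian to
# samples drawn from the previous generation's fit, with no fresh real data.
import numpy as np

rng = np.random.default_rng(42)

# Generation 0: fit to "human-generated" data.
real_data = rng.normal(loc=0.0, scale=1.0, size=50)
mu, sigma = real_data.mean(), real_data.std()

# Later generations: train only on the previous model's synthetic output.
for generation in range(1, 21):
    synthetic = rng.normal(mu, sigma, size=50)
    mu, sigma = synthetic.mean(), synthetic.std()
    if generation % 5 == 0:
        print(f"generation {generation:2d}: mean = {mu:+.3f}, std = {sigma:.3f}")
```

In this toy setup, with only 50 samples per generation, the drift becomes visible within a couple of dozen rounds; larger samples slow the decay, but nothing in the loop pulls the model back toward the real data it no longer sees.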

So the human reviewer’s role is not at risk of anything but neglect.

