Mind Matters Natural and Artificial Intelligence News and Analysis
Human Interacting with AI (Artificial Intelligence) Brain
Image Credit: William - Adobe Stock

Scientist Sleuths Detect Significant Use of AI in Journal Papers

The researchers found that, after chatbots (LLMs) became available, writing styles shifted from content words to more “stylistic and flowery” word choices

At Phys.org, science writer Charles Blue reports on an effort to track the use of AI to write papers — and the sobering results:

To shed light on just how widespread LLM content is in academic writing, a team of U.S. and German researchers analyzed more than 15 million biomedical abstracts on PubMed to determine if LLMs have had a detectable impact on specific word choices in journal articles.

Their investigation revealed that since the emergence of LLMs there has been a corresponding increase in the frequency of certain stylistic word choices within the academic literature. These data suggest that at least 13.5% of the papers published in 2024 were written with some amount of LLM processing. The results appear in the open-access journal Science Advances.

“Massive study detects AI fingerprints in millions of scientific papers,” July 6, 2025

Analyzing language patterns, the authors found that, after chatbots (large language models, or LLMs) became available three years ago, writing styles shifted significantly "away from the excess use of 'content words' to an excess use of 'stylistic and flowery' word choices," such as "showcasing," "pivotal," and "grappling."

AI Chatbots Working and Chatting in Computer
Image Credit: FATNA - Adobe Stock

From the open-access paper:

Large language models (LLMs) like ChatGPT can generate and revise text with human-level performance. These models come with clear limitations, can produce inaccurate information, and reinforce existing biases. Yet, many scientists use them for their scholarly writing. But how widespread is such LLM usage in the academic literature? To answer this question for the field of biomedical research, we present an unbiased, large-scale approach: We study vocabulary changes in more than 15 million biomedical abstracts from 2010 to 2024 indexed by PubMed and show how the appearance of LLMs led to an abrupt increase in the frequency of certain style words. This excess word analysis suggests that at least 13.5% of 2024 abstracts were processed with LLMs. This lower bound differed across disciplines, countries, and journals, reaching 40% for some subcorpora. We show that LLMs have had an unprecedented impact on scientific writing in biomedical research, surpassing the effect of major world events such as the COVID pandemic.
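The core idea of the excess word analysis can be sketched in a few lines of code: track how often a marker word appears in abstracts each year, extrapolate an expected frequency from pre-LLM years, and treat the surplus as "excess." The sketch below is a deliberately simplified toy, not the authors' published pipeline — the function names, the tiny hypothetical corpus, and the flat-average baseline are all assumptions for illustration (the real study models counterfactual trends over 15 million abstracts).

```python
# Toy sketch of an "excess vocabulary" analysis (illustrative only;
# the published method by Kobak et al. is far more sophisticated).

def word_frequency(abstracts, word):
    """Fraction of abstracts whose tokens include the given word."""
    hits = sum(1 for a in abstracts if word in a.lower().split())
    return hits / len(abstracts)

def excess_frequency(corpus_by_year, word, year, baseline_years):
    """Observed minus expected frequency for one word in one year.

    Assumption: expected frequency is a flat average over pre-LLM
    baseline years; the real study fits counterfactual trends instead.
    """
    baseline = [word_frequency(corpus_by_year[y], word) for y in baseline_years]
    expected = sum(baseline) / len(baseline)
    observed = word_frequency(corpus_by_year[year], word)
    return observed - expected

# Hypothetical mini-corpus: a few abstracts per year.
corpus = {
    2021: ["we measured enzyme activity", "results show a modest effect"],
    2022: ["we measured binding affinity", "the cohort was followed up"],
    2024: ["our pivotal findings are showcasing a delve into mechanisms",
           "we delve into pivotal pathways, showcasing robust effects"],
}

for w in ["pivotal", "showcasing"]:
    ex = excess_frequency(corpus, w, 2024, baseline_years=[2021, 2022])
    print(f"excess frequency of '{w}' in 2024: {ex:+.2f}")
```

A large positive excess across many such style words, aggregated over millions of abstracts, is what lets the authors put a lower bound on LLM involvement without flagging any individual paper.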

Dmitry Kobak et al., "Delving into LLM-assisted writing in biomedical publications through excess vocabulary," Science Advances 11, eadt3813 (2025). DOI: 10.1126/sciadv.adt3813

The authors noted that their detection method turned up considerable variation across fields of study, and they suggest reasons for it:

Our estimated lower bound on LLM usage ranged from below 5% to more than 40% across different PubMed-indexed research fields, affiliation countries, and journals. This heterogeneity could correspond to actual differences in LLM adoption. For example, the high lower bound on LLM usage in computational fields (20%) could be due to computer science researchers being more familiar with and willing to adopt LLM technology. In non-English speaking countries, LLMs can help authors with editing English texts, which could justify their extensive use. Last, authors publishing in journals with expedited and/or simplified review processes might be grabbing for LLMs to write low-effort articles.

"Delving into LLM-assisted writing in biomedical publications through excess vocabulary"

But they qualify their assumptions:

It is possible that native and non-native English speakers actually use LLMs equally often, but native speakers may be better at noticing and actively removing unnatural style words from LLM outputs. Our method would not be able to pick up the increased frequency of such more advanced LLM usage.

"Delving into LLM-assisted writing in biomedical publications through excess vocabulary"

Kobak et al. worry that what is gained in correct English may be lost in content because “LLMs are infamous for making up references (10), providing inaccurate summaries, and making false claims that sound authoritative and convincing.”

The authors of any given paper actually did the work behind it and will therefore be attuned to even small discrepancies. The LLM did nothing of the kind. So the more LLMs are relied on, the less likely it is that incorrect or false material will be spotted pre-publication, unless it is truly bizarre.

The study authors also worry about LLM plagiarism and homogeneity:

This makes LLM outputs less diverse and novel than human-written text. Such homogenization can degrade the quality of scientific writing. For instance, all LLM-generated introductions on a certain topic might sound the same and would contain the same set of ideas and references, thereby missing out on innovations and exacerbating citation injustice. Even worse, it is likely that malign actors such as paper mills will use LLMs to produce fake publications.

It would be remarkable if the paper mills are not already deep into LLM-generated papers.

The fact that there may be no turning back does not mean, of course, that we will all just muddle through somehow. Reliance on machine-written material may signal a long and continuing decline in the quality of research in many fields. Paper mills may even become irrelevant if machine-written papers are the norm. We shall see.

Remembering pre-LLMs 2018 fondly:

Quantity vs quality: Can AI help scientists produce better papers? What happens when scientists simply can't read all their peers' papers and still find time for original research? Quantity is definitely a solved problem. STM, the "voice of scholarly publishing," estimated in 2015 that roughly 2.5 million science papers are published each year. Some are, admittedly, in predatory or fake journals. But over 2800 journals are assumed to be genuine. (November 30, 2018)

Starting to get the picture in 2024:

Can AI help stem the tide of fake science papers? One problem is that science journals don’t do a very good job of establishing author identities. Chatbots are bound to make things worse. AI can help detect fake papers but it is not going to save researchers from a partly AI-driven dystopia. (May 16, 2024)

