ChatGPT use increasing in peer reviews, claims Stanford University study

Voltaire Staff
Apr 15, 2024
2 min read

Researchers have found that up to 17 per cent of peer reviews submitted to major computer science conferences may contain AI-generated writing that goes beyond spell-checking and grammar fixes, a recent study has revealed.

Published on March 11 on the arXiv preprint server, the study examined how AI chatbots may have influenced peer reviews submitted to four major computer science conferences since the emergence of ChatGPT.

According to the analysis, up to 17 per cent of the peer-review reports may have undergone significant modifications with the assistance of chatbots.

However, the study left unanswered questions about whether these tools were used to craft reviews from scratch or to enhance existing drafts.

The study, led by Weixin Liang, a computer scientist at Stanford University in California, developed a method to identify AI-generated text by detecting adjectives that AI uses more often than humans do.

By analysing over 146,000 peer reviews submitted to the same conferences before and after ChatGPT's release, the study revealed a notable increase in the frequency of certain positive adjectives, such as 'commendable', 'innovative', 'notable', and 'versatile', after the mainstream adoption of the chatbot.

The research pinpointed the 100 most disproportionately used adjectives.
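To illustrate the general idea of spotting disproportionately used words, here is a minimal sketch in Python. It is not the Stanford team's method, which the article does not describe in detail; it simply compares how often each tracked word appears per word of text in a "before" and an "after" set of reviews and ranks words by the ratio. The function names, the smoothing constant, and the toy reviews are all hypothetical and invented for illustration.

```python
import re
from collections import Counter


def word_rates(texts, vocabulary):
    """Return, for each tracked word, its occurrences per word of text."""
    counts = Counter()
    total = 0
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        total += len(tokens)
        counts.update(t for t in tokens if t in vocabulary)
    return {w: (counts[w] / total if total else 0.0) for w in vocabulary}


def most_overused(before, after, vocabulary, top_n=100, smoothing=1e-6):
    """Rank tracked words by (rate after) / (rate before), highest first.

    `smoothing` is an assumed small constant that avoids division by zero
    for words that never appear in the earlier reviews.
    """
    rate_before = word_rates(before, vocabulary)
    rate_after = word_rates(after, vocabulary)
    ratios = {
        w: (rate_after[w] + smoothing) / (rate_before[w] + smoothing)
        for w in vocabulary
    }
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Toy data, invented for illustration only (not taken from the study).
adjectives = {"commendable", "innovative", "notable", "versatile"}
reviews_before = ["The paper is good and the method is innovative."]
reviews_after = [
    "A commendable and innovative paper with notable results.",
    "The versatile method is commendable.",
]

for word, ratio in most_overused(reviews_before, reviews_after, adjectives):
    print(word, round(ratio, 2))
```

A ratio well above 1 means a word became relatively more common in the later reviews. On its own, that shows a shift in vocabulary rather than proof that any individual review was written by a chatbot.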

"It seems like when people have a lack of time, they tend to use ChatGPT," said Liang, according to Nature.

"Our results suggest that between 6.5% and 16.9% of text submitted as peer reviews to these conferences could have been substantially modified by LLMs, i.e. beyond spell-checking or minor writing updates.

"The circumstances in which generated text occurs offer insight into user behavior: the estimated fraction of LLM-generated text is higher in reviews which report lower confidence, were submitted close to the deadline, and from reviewers who are less likely to respond to author rebuttals," claimed the paper published by Stanford researchers.

Debora Weber-Wulff, a computer scientist at the HTW Berlin–University of Applied Sciences in Germany, told Nature, "It’s the expectation that a human researcher looks at it." She added, "AI systems 'hallucinate,' and we can’t know when they’re hallucinating and when they’re not."

According to Weber-Wulff, the idea of chatbots writing referee reports for unpublished work is "very shocking" given that the tools often generate misleading or fabricated information.

Since its launch in November 2022, ChatGPT has been used to draft numerous scientific papers, and in some instances it has even been credited as an author.

According to a 2023 survey conducted by Nature, out of over 1,600 respondents, nearly 30 per cent reported using generative AI to write papers, while about 15 per cent used it for their literature reviews and grant applications.

The Stanford team's findings prompted Andrew Gray, a bibliometrics support officer at University College London, to examine how frequently similar adjectives and a range of adverbs appear in peer-reviewed studies published from 2015 to 2023.

"We have the signs that these things are being used but we don’t really understand how they’re being used," he told Nature.

Weber-Wulff also said that employing chatbots for peer review might raise copyright concerns, as it could entail granting these tools access to confidential, unpublished content.