Some scientists can't stop using AI to write research papers

https://www.theregister.com/2024/05/03/ai_scientific_articles/

Linguistic and statistical analyses of scientific articles suggest that generative AI may have been used to write an increasing amount of scientific literature.

Two academic papers assert that analyzing word choice in the corpus of science publications reveals an increasing usage of AI for writing research papers. One study, published in March by Andrew Gray of University College London in the UK, suggests at least one percent – 60,000 or more – of all papers published in 2023 were written at least partially by AI.

A second paper published in April by a Stanford University team in the US claims this figure might range between 6.3 and 17.5 percent, depending on the topic.

Both papers looked for certain words that large language models (LLMs) use habitually, such as "intricate," "pivotal," and "meticulously." By tracking the use of those words across scientific literature, and comparing this to words that aren't particularly favored by AI, the two studies say they can detect an increasing reliance on machine learning within the scientific publishing community.
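The tracking both studies describe boils down to measuring how often a word appears per million words in each year's corpus, then comparing the rate before and after ChatGPT's arrival. Here's a minimal sketch of that idea; the function names and toy corpora are illustrative, not taken from either paper:

```python
from collections import Counter

def word_frequency(abstracts):
    """Per-million-word frequency of each word across a list of texts."""
    counts = Counter()
    total = 0
    for text in abstracts:
        words = text.lower().split()
        counts.update(words)
        total += len(words)
    return {w: c * 1_000_000 / total for w, c in counts.items()}

def percent_change(freq_before, freq_after, word):
    """Percent change in a word's per-million rate between two corpora."""
    before = freq_before.get(word, 0)
    after = freq_after.get(word, 0)
    if before == 0:
        return float("inf")  # word absent in the earlier corpus
    return 100 * (after - before) / before
```

Run over, say, 2019-2022 abstracts versus 2023 abstracts, a large jump for "meticulously" alongside flat rates for control words like "red" or "after" is the kind of signal both teams report.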

In Gray's paper, the use of control words like "red," "conclusion," and "after" changed by only a few percent from 2019 to 2023. The same was true of certain other adjectives and adverbs until 2023 (termed the post-LLM year by Gray).

In that year, use of the words "meticulous," "commendable," and "intricate" rose by 59, 83, and 117 percent respectively, while their prevalence in scientific literature had hardly changed between 2019 and 2022. The word with the single biggest increase in prevalence post-2022 was "meticulously," up 137 percent.

The Stanford paper found similar phenomena, demonstrating a sudden increase for the words "realm," "showcasing," "intricate," and "pivotal." The first two were used about 80 percent more often than in 2021 and 2022, while the latter two were used around 120 percent and almost 160 percent more frequently, respectively.

The researchers also considered word usage statistics in various scientific disciplines. Computer science and electrical engineering were ahead of the pack when it came to using AI-preferred language, while mathematics, physics, and papers published by the journal Nature only saw increases of between 5 and 7.5 percent.

The Stanford bods also noted that authors posting more preprints, working in more crowded fields, and writing shorter papers seem to use AI more frequently. Their paper suggests that a general lack of time and a need to write as much as possible encourages the use of LLMs, which can help increase output.

Potentially the next big controversy in the scientific community

Using AI to help in the research process isn't anything new, and lots of boffins are open about utilizing AI to tweak experiments to achieve better results. Using AI to actually write abstracts and other chunks of papers is very different, however: the general expectation is that scientific articles are written by actual humans, not robots, and at least a couple of publishers consider using LLMs to write papers to be scientific misconduct.

Using AI models can be very risky as they often produce inaccurate text, the very thing scientific literature is not supposed to do. AI models can even fabricate quotations and citations, an occurrence that infamously got two New York attorneys in trouble for citing cases ChatGPT had dreamed up.

"Authors who are using LLM-generated text must be pressured to disclose this or to think twice about whether doing so is appropriate in the first place, as a matter of basic research integrity," University College London’s Gray opined.

The Stanford researchers also raised similar concerns, writing that use of generative AI in scientific literature could create "risks to the security and independence of scientific practice." ®

By Matthew Connatser, The Register, 3 May 2024