"Bosom peril" is not "breast cancer": How weird computer-generated phrases help researchers find scientific publishing fraud -…

In 2020, despite the COVID pandemic, scientists authored 6 million peer-reviewed publications, a 10 percent increase compared to 2019. At first glance this big number seems like a good thing, a positive indicator of science advancing and knowledge spreading. Among these millions of papers, however, are thousands of fabricated articles, many from academics who feel compelled by a publish-or-perish mentality to produce, even if it means cheating.

But in a new twist to the age-old problem of academic fraud, modern plagiarists are making use of software and perhaps even emerging AI technologies to draft articles, and they're getting away with it.

The growth in research publication, combined with the availability of new digital technologies, suggests that computer-mediated fraud in scientific publication is only likely to get worse. Fraud like this not only affects the researchers and publications involved, but it can also complicate scientific collaboration and slow down the pace of research. Perhaps the most dangerous outcome is that fraud erodes the public's trust in scientific research. Finding these cases is therefore a critical task for the scientific community.

We have been able to spot fraudulent research thanks in large part to one key tell that an article has been artificially manipulated: the nonsensical "tortured phrases" that fraudsters use in place of standard terms to avoid anti-plagiarism software. Our computer system, which we named the Problematic Paper Screener, searches through published science and seeks out tortured phrases in order to find suspect work. While this method works, as AI technology improves, spotting these fakes will likely become harder, raising the risk that more fake science makes it into journals.

What are tortured phrases? A tortured phrase is an established scientific concept paraphrased into a nonsensical sequence of words. "Artificial intelligence" becomes "counterfeit consciousness." "Mean square error" becomes "mean square blunder." "Signal to noise" becomes "flag to clamor." "Breast cancer" becomes "bosom peril." Teachers may have noticed some of these phrases in students' attempts to get good grades by using paraphrasing tools to evade plagiarism detection.

As of January 2022, we've found tortured phrases in 3,191 published peer-reviewed articles (and counting), including in reputable flagship publications. The two most frequent countries listed in the authors' affiliations are India (71.2 percent) and China (6.3 percent). In one specific journal that had a high prevalence of tortured phrases, we also noticed that the time between when an article was submitted and when it was accepted for publication declined from an average of 148 days in early 2020 to 42 days in early 2021. Many of these articles had authors affiliated with institutions in India and China, where the pressure to publish may be exceedingly high.

In China, for example, institutions have been documented to impose production targets that are nearly impossible to meet. Doctors affiliated with Chinese hospitals, for instance, have to get published to get promoted, but many are too busy in the hospital to do so.

Tortured phrases also star in lazy surveys of the literature: Someone copies abstracts from papers, paraphrases them, and pastes them in a document to form gibberish devoid of any meaning.

Our best guess for the source of tortured phrases is that authors are using automated paraphrasing tools; dozens can be easily found online. Crooked scientists are using these tools to copy text from various genuine sources, paraphrase it, and paste the tortured result into their own papers. How do we know this? A strong piece of evidence is that one can reproduce most tortured phrases by feeding established terms into paraphrasing software.
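To make the mechanism concrete, here is a minimal sketch of how blind, word-by-word synonym substitution can turn established terms into tortured phrases. It is a toy model only: the `TOY_SYNONYMS` table and the `naive_paraphrase` function are hypothetical names invented for illustration, the table holds just a handful of word pairs drawn from the examples above, and real paraphrasing tools are presumably far more elaborate.

```python
import re

# Hypothetical toy thesaurus: lay-language synonyms for a few words.
# Real paraphrasing tools use much larger vocabularies.
TOY_SYNONYMS = {
    "breast": "bosom",
    "cancer": "peril",
    "signal": "flag",
    "noise": "clamor",
    "error": "blunder",
}


def naive_paraphrase(text: str) -> str:
    """Swap each known word for a lay synonym, ignoring technical context."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        # Words missing from the thesaurus are left untouched.
        return TOY_SYNONYMS.get(word.lower(), word)

    return re.sub(r"[A-Za-z]+", swap, text)


for term in ["breast cancer", "signal to noise", "mean square error"]:
    print(term, "->", naive_paraphrase(term))
```

Because the substitution never checks whether a word is part of a fixed technical term, an established concept comes out as a phrase no specialist would write.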

Using paraphrasing software can introduce factual errors. Replacing a word with its synonym in lay language may lead to a different scientific meaning. For example, in engineering literature, when "accuracy" replaces "precision" (or vice versa), different notions are mixed up; the text is not only paraphrased but becomes wrong.

We also found published papers that appear to have been partly generated with AI language models like GPT-2, a system developed by OpenAI. Unlike papers where authors seem to have used paraphrasing software, which changes existing text, these AI models can produce text out of whole cloth.

While computer programs that can create science or math articles have been around for almost two decades (like SCIgen, a program developed by MIT graduate students in 2005 to create science papers, or Mathgen, which has been producing math papers since 2012), the newer AI language models present a thornier problem. Unlike the pure nonsense produced by Mathgen or SCIgen, the output of the AI systems is much harder to detect. For example, given the beginning of a sentence as a starting point, a model like GPT-2 can complete the sentence and even generate entire paragraphs. Some papers appear to be produced by these systems. We screened a sample of about 140,000 abstracts of papers published by Elsevier, an academic publisher, in 2021 with OpenAI's GPT-2 detector. Hundreds of suspect papers featuring synthetic text appeared in dozens of reputable journals.

AI could compound an existing problem in academic publishing (the paper mills that churn out articles for a price) by making paper mill fakes easier to produce and harder to suss out.

How we found tortured phrases. We spotted our first tortured phrase last spring while reviewing various papers for suspicious abnormalities, like evidence of citation gaming or references to predatory journals. Ever heard of "profound neural organization"? Computer scientists may recognize this as a distorted reference to a "deep neural network." This led us to search for this phrase in the entire scientific literature, where we found several other articles with the same bizarre language, some of which contained other tortured phrases as well. Finding more and more articles with more and more tortured phrases (473 such phrases as of January 2022), we realized that the problem is big enough to be called out in public.

To track papers with tortured phrases, as well as meaningless papers produced by SCIgen or Mathgen (which have also made it into publications), we developed the Problematic Paper Screener. Behind the curtains, the software relies on open science tools to search for tortured phrases in scientific papers and to check whether others had already flagged issues. Finding problematic papers with tortured phrases has become a crowd effort, as researchers have used our software to find new phrases.
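The core idea of screening text against a list of known tortured phrases can be sketched in a few lines. This is not the Problematic Paper Screener's actual code, whose internals are not described here; the `TORTURED_PHRASES` table and the `find_tortured_phrases` function are hypothetical names, and the table contains only the five examples quoted in this article rather than the full list of 473 phrases.

```python
import re

# Tortured phrase -> established term, limited to examples from this article.
TORTURED_PHRASES = {
    "counterfeit consciousness": "artificial intelligence",
    "mean square blunder": "mean square error",
    "flag to clamor": "signal to noise",
    "bosom peril": "breast cancer",
    "profound neural organization": "deep neural network",
}


def find_tortured_phrases(text: str) -> list[tuple[str, str, int]]:
    """Return (phrase, expected term, character offset) for each hit."""
    hits = []
    for phrase, expected in TORTURED_PHRASES.items():
        # Tolerate line breaks or extra spaces between the words of a phrase.
        words = [re.escape(w) for w in phrase.split()]
        pattern = r"\b" + r"\s+".join(words) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((phrase, expected, match.start()))
    # Report hits in the order they appear in the text.
    return sorted(hits, key=lambda hit: hit[2])


abstract = "We train a profound neural organization to detect bosom peril."
for phrase, expected, offset in find_tortured_phrases(abstract):
    print(f"offset {offset}: '{phrase}' (expected '{expected}')")
```

A match is only a signal that a paper deserves a closer look by a human reader. A fixed list like this also catches only phrases someone has already spotted, which is one reason crowd-sourced reports of new phrases matter.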

The problem of tortured phrases. Scientific editors and referees certainly reject buggy submissions with tortured phrases, but a fraction still evades their vigilance and gets published. This means researchers could waste time filtering through published scams. Another problem is that interdisciplinary research could get bogged down by unreliable work: for example, if a public health expert wanted to collaborate with a computer scientist who had published about a diagnostic tool in a fraudulent paper.

And as computers do more aggregating work, faulty articles could also jeopardize future AI-based research tools. For example, in 2019, the publisher Springer Nature used AI to analyze 1,086 publications and generate a handbook on lithium-ion batteries. The AI created coherent chapters and sections and succinct summaries of the articles. What if the source material for these sorts of projects were to include nonsensical, tortured publications?

The presence of this junk pseudo-scientific literature also undermines citizens' trust in scientists and science, especially when it gets dragged into public policy debates.

Recently, tortured phrases have even turned up in scientific literature on the COVID-19 pandemic. One paper published in July 2020, since retracted, was cited 52 times as of this month, despite mentioning the phrase "extreme intense respiratory syndrome (SARS)," which is clearly a reference to severe acute respiratory syndrome, the disease caused by the coronavirus SARS-CoV-1. Other papers contained the same tortured phrase.

Once fraudulent papers are found, getting them retracted is no easy task.

Editors and publishers who are members of the Committee on Publication Ethics must follow pre-established complex guidelines when they find problematic papers. But the process has a loophole. Publishers investigate the issue for months or years because they are supposed to wait for answers and explanations from authors for an undefined amount of time.

AI will help detect meaningless papers, erroneous ones, or those featuring tortured phrases. But this will be effective only in the short to medium term. AI checking tools could end up provoking an arms race in the longer term, when text-generating tools are pitted against those that detect artificial texts, potentially leading to ever-more-convincing fakes.

But there are a few steps academia can take to address the problem of fraudulent papers.

Apart from a sense of achievement, there is no clear incentive for a reviewer to deliver a thoughtful critique of a submitted paper, and no direct penalty for peer review performed carelessly. Incentivizing stricter checks during peer review and after a paper is published would alleviate the problem. Promoting post-publication peer review at PubPeer.com, where researchers can critique articles in an unofficial context, and encouraging other ways to engage the research community more broadly could shed light on suspicious science.

In our view, the emergence of tortured phrases is a direct consequence of the publish-or-perish system. Scientists and policy makers need to question the intrinsic value of racking up high article counts as the most important career metric. Other output must be rewarded, including proper peer reviews, data sets, preprints, and post-publication discussions. If we act now, we have a chance to pass a sustainable scientific environment on to future generations of researchers.
