Enlarge / An AI-generated image of James Madison writing the US Constitution using AI.
Midjourney / Benj Edwards
If you feed America's most important legal document, the US Constitution, into a tool designed to detect text written by AI models like ChatGPT, it will tell you that the document was almost certainly written by AI. But unless James Madison was a time traveler, that can't be the case. Why do AI writing detection tools give false positives? We spoke to several experts, including the creator of AI writing detector GPTZero, to find out.
Among news stories of overzealous professors flunking an entire class due to the suspicion of AI writing tool use and kids falsely accused of using ChatGPT, generative AI has education in a tizzy. Some think it represents an existential crisis. Teachers relying on educational methods developed over the past century have been scrambling for ways to preserve the status quo: the tradition of relying on the essay as a tool to gauge student mastery of a topic.
As tempting as it is to rely on AI tools to detect AI-generated writing, evidence so far has shown that they are not reliable. Due to false positives, AI writing detectors such as GPTZero, ZeroGPT, and OpenAI's Text Classifier cannot be trusted to detect text composed by large language models (LLMs) like ChatGPT.
A viral screenshot from April 2023 showing GPTZero saying, "Your text is likely to be written entirely by AI" when fed part of the US Constitution.
Ars Technica
When fed part of the US Constitution, ZeroGPT says, "Your text is AI/GPT Generated."
Ars Technica
When fed part of the US Constitution, OpenAI's Text Classifier says, "The classifier considers the text to be unclear if it is AI-generated."
Ars Technica
If you feed GPTZero a section of the US Constitution, it says the text is "likely to be written entirely by AI." Several times over the past six months, screenshots of other AI detectors showing similar results have gone viral on social media, inspiring confusion and plenty of jokes about the founding fathers being robots. It turns out the same thing happens with selections from The Bible, which also show up as being AI-generated.
To explain why these tools make such obvious mistakes (and otherwise often return false positives), we first need to understand how they work.
Different AI writing detectors use slightly different methods of detection but with a similar premise: There's an AI model that has been trained on a large body of text (consisting of millions of writing examples) and a set of surmised rules that determine whether the writing is more likely to be human- or AI-generated.
For example, at the heart of GPTZero is a neural network trained on "a large, diverse corpus of human-written and AI-generated text, with a focus on English prose," according to the service's FAQ. Next, the system uses properties like "perplexity" and "burstiness" to evaluate the text and make its classification.
Bonnie Jacobs / Getty Images
In machine learning, perplexity is a measurement of how much a piece of text deviates from what an AI model has learned during its training. As Dr. Margaret Mitchell of AI company Hugging Face told Ars, "Perplexity is a function of 'how surprising is this language based on what I've seen?'"
So the thinking behind measuring perplexity is that when they're writing text, AI models like ChatGPT will naturally reach for what they know best, which comes from their training data. The closer the output is to the training data, the lower the perplexity rating. Humans are much more chaotic writers, or at least that's the theory, but humans can write with low perplexity, too, especially when imitating a formal style used in law or certain types of academic writing. Also, many of the phrases we use are surprisingly common.
Let's say we're guessing the next word in the phrase "I'd like a cup of _____." Most people would fill in the blank with "water," "coffee," or "tea." A language model trained on a lot of English text would do the same because those phrases occur frequently in English writing. The perplexity of any of those three results would be quite low because the prediction is fairly certain.
Now consider a less common completion: "I'd like a cup of spiders." Both humans and a well-trained language model would be quite surprised (or "perplexed") by this sentence, so its perplexity would be high. (As of this writing, the phrase "I'd like a cup of spiders" gives exactly one result in a Google search, compared to 3.75 million results for "I'd like a cup of coffee.")
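To make the idea concrete, here is a toy sketch of the perplexity calculation. The probabilities below are invented for illustration (they are not from any real model), but the formula is the standard one: perplexity is the exponential of the average negative log-probability the model assigns to the text.

```python
import math

# Hypothetical next-word probabilities a model might assign to the word
# that fills "I'd like a cup of ___". Numbers are illustrative only.
next_word_probs = {
    "coffee": 0.40,
    "tea": 0.30,
    "water": 0.20,
    "spiders": 0.0001,
}

def perplexity(words, probs, floor=1e-6):
    """Perplexity = exp of the average negative log-probability.
    Lower means the text was less 'surprising' to the model."""
    log_sum = sum(math.log(probs.get(w, floor)) for w in words)
    return math.exp(-log_sum / len(words))

# A predictable completion scores low; a surprising one scores high.
print(perplexity(["coffee"], next_word_probs))   # ~2.5
print(perplexity(["spiders"], next_word_probs))  # ~10000
```

Note that with a single word, perplexity is just the reciprocal of its probability, which is why "coffee" (40% likely) scores roughly 2.5 while "spiders" scores around 10,000.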
If the language in a piece of text isn't surprising based on the model's training, the perplexity will be low, so the AI detector will be more likely to classify that text as AI-generated. This leads us to the interesting case of the US Constitution. In essence, the Constitution's language is so ingrained in these models that they classify it as AI-generated, creating a false positive.
GPTZero creator Edward Tian told Ars Technica, "The US Constitution is a text fed repeatedly into the training data of many large language models. As a result, many of these large language models are trained to generate similar text to the Constitution and other frequently used training texts. GPTZero predicts text likely to be generated by large language models, and thus this fascinating phenomenon occurs."
The problem is that it's entirely possible for human writers to create content with low perplexity as well (if they write primarily using common phrases such as "I'd like a cup of coffee," for example), which deeply undermines the reliability of AI writing detectors.
Another property of text measured by GPTZero is "burstiness," which refers to the phenomenon where certain words or phrases appear in rapid succession or "bursts" within a text. Essentially, burstiness evaluates the variability in sentence length and structure throughout a text.
Human writers often exhibit a dynamic writing style, resulting in text with variable sentence lengths and structures. For instance, we might write a long, complex sentence followed by a short, simple one, or we might use a burst of adjectives in one sentence and none in the next. This variability is a natural outcome of human creativity and spontaneity.
AI-generated text, on the other hand, tends to be more consistent and uniform, at least so far. Language models, which are still in their infancy, generate sentences with more regular lengths and structures. This lack of variability can result in a low burstiness score, indicating that the text may be AI-generated.
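As a rough sketch of the idea above, burstiness can be approximated as the spread of sentence lengths in a passage. Real detectors use more sophisticated features, but this crude proxy captures the intuition that varied sentence lengths score higher than uniform ones:

```python
import statistics

def burstiness(text):
    """Crude proxy for burstiness: the population standard deviation
    of sentence lengths (in words). Higher = more variable sentences."""
    # Treat '.', '!', and '?' as sentence boundaries (a simplification).
    for mark in "!?":
        text = text.replace(mark, ".")
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0

varied = ("It rained. The storm that followed knocked out power "
          "across three counties for days. We waited.")
uniform = ("The cat sat on the mat. The dog lay on the rug. "
           "The bird sat in the cage.")

print(burstiness(varied) > burstiness(uniform))  # True
```

The varied passage mixes two-word and twelve-word sentences, so its score is high; the uniform passage's sentences are all six words, so its score is zero.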
However, burstiness isn't a foolproof metric for detecting AI-generated content, either. As with perplexity, there are exceptions. A human writer may write in a highly structured, consistent style, resulting in a low burstiness score. Conversely, an AI model might be trained to emulate a more human-like variability in sentence length and structure, raising its burstiness score. In fact, as AI language models improve, studies show that their writing looks more and more like human writing all the time.
Ultimately, there's no magic formula that can always distinguish human-written text from that composed by a machine. AI writing detectors can make a strong guess, but the margin of error is too large to rely on them for an accurate result.
A 2023 study from researchers at the University of Maryland demonstrated empirically that detectors for AI-generated text are not reliable in practical scenarios and that they perform only marginally better than a random classifier. Not only do they return false positives, but detectors and watermarking schemes (that seek to alter word choice in a telltale way) can easily be defeated by "paraphrasing attacks" that modify language model output while retaining its meaning.
"I think they're mostly snake oil," said AI researcher Simon Willison of AI detector products. "Everyone desperately wants them to workpeople in education especiallyand it's easy to sell a product that everyone wants, especially when it's really hard to prove if it's effective or not."
Additionally, a recent study from Stanford University researchers showed that AI writing detection is biased against non-native English speakers, throwing out high false-positive rates for their human-written work and potentially penalizing them in the global discourse if AI detectors become widely used.
Some educators, like Professor Ethan Mollick of Wharton School, are accepting this new AI-infused reality and even actively promoting the use of tools like ChatGPT to aid learning. Mollick's reaction is reminiscent of how some teachers dealt with the introduction of pocket calculators into classrooms: They were initially controversial but eventually came to be widely accepted.
"There is no tool that can reliably detect ChatGPT-4/Bing/Bard writing," Mollick tweeted recently. "The existing tools are trained on GPT-3.5, they have high false positive rates (10%+), and they are incredibly easy to defeat." Additionally, ChatGPT itself cannot assess whether text is AI-written or not, he added, so you can't just paste in text and ask if it was written by ChatGPT.
In a conversation with Ars Technica, GPTZero's Tian seemed to see the writing on the wall and said he plans to pivot his company away from vanilla AI detection into something more ambiguous. "Compared to other detectors, like Turnitin, we're pivoting away from building detectors to catch students, and instead, the next version of GPTZero will not be detecting AI but highlighting what's most human, and helping teachers and students navigate together the level of AI involvement in education," he said.
How does he feel about people using GPTZero to accuse students of academic dishonesty? Unlike traditional plagiarism checker companies, Tian said, "We don't want people using our tools to punish students. Instead, for the education use case, it makes much more sense to stop relying on detection on the individual level (where some teachers punish students and some teachers are fine with AI technologies) but to apply these technologies on the school [or] school board [level], even across the country, because how can we craft the right policies to respond to students using AI technologies until we understand what is going on, and the degree of AI involvement across the board?"
Yet despite the inherent problems with accuracy, GPTZero still advertises itself as being "built for educators," and its site proudly displays a list of universities that supposedly use the technology. There's a strange tension between Tian's stated goals not to punish students and his desire to make money with his invention. But whatever the motives, using these flawed products can have terrible effects on students. Perhaps the most damaging result of people using these inaccurate and imperfect tools is the personal cost of false accusations.
A case reported by USA Today highlights the issue in a striking way. A student was accused of cheating based on AI text detection tools and had to present his case before an honor board. His defense included showing his Google Docs history to demonstrate his research process. Despite the board finding no evidence of cheating, the stress of preparing to defend himself led the student to experience panic attacks. Similar scenarios have played out dozens (if not hundreds) of times across the US and are commonly documented on desperate Reddit threads.
Common penalties for academic dishonesty often include failing grades, academic probation, suspension, or even expulsion, depending on the severity and frequency of the violation. That's a difficult charge to face, and the use of flawed technology to levy those charges feels almost like a modern-day academic witch hunt.
In light of the high rate of false positives and the potential to punish non-native English speakers unfairly, it's clear that the science of detecting AI-generated text is far from foolproof, and likely never will be. Humans can write like machines, and machines can write like humans. A more helpful question might be: Do humans who write with machine assistance understand what they are saying? If someone is using AI tools to fill in factual content in a way they don't understand, that should be easy enough to figure out by a competent reader or teacher.
AI writing assistance is here to stay, and if used wisely, AI language models can potentially speed up composition in a responsible and ethical way. Teachers may want to encourage responsible use and ask questions like: Does the writing reflect the intentions and knowledge of the writer? And can the human author vouch for every fact included?
A teacher who is also a subject matter expert could quiz students on the contents of their work afterward to see how well they understand it. Writing is not just a demonstration of knowledge but a projection of a person's reputation, and if the human author can't stand by every fact represented in the writing, AI assistance has not been used appropriately.
Like any tool, language models can be used poorly or used with skill. And that skill also depends on context: You can paint an entire wall with a paintbrush or create the Mona Lisa. Both scenarios are an appropriate use of the tool, but each demands different levels of human attention and creativity. Similarly, some rote writing tasks (generating standardized weather reports, perhaps) may be accelerated appropriately by AI, while more intricate tasks need more human care and attention. There's no black-or-white solution.
For now, Ethan Mollick told Ars Technica that despite panic from educators, he isn't convinced that anyone should use AI writing detectors. "I am not a technical expert in AI detection," Mollick said. "I can speak from the perspective of an educator working with AI to say that, as of now, AI writing is undetectable and likely to remain so, AI detectors have high false positive rates, and they should not be used as a result."
Read the original here: Why AI detectors think the US Constitution was written by AI - Ars Technica