Are AI detectors actually reliable?

Last updated: May 2026

You paste in some text. The tool says "97% AI." What does that number actually mean? And should you trust it enough to fail a student, fire a writer, or pull an article? The short answer is no. The longer answer is what this piece is about.

If you've used an AI detector recently, you've probably noticed the same thing I have. You paste in some text. The tool thinks for a second. Then it gives you a percentage in big numbers, in a confident colour: green for human, red for AI. Sometimes there's a little progress bar. And that's the whole answer.

No explanation. No reasoning. No "here's what we noticed." Just the number.

This has caused real problems. There are documented cases of students who never used AI being failed for plagiarism on the strength of a detector score. Writers have lost jobs because something they wrote got flagged. Whole articles have been pulled from publication because the detector said so.

So it's worth asking honestly: how reliable are these tools? And if the answer is "less than they pretend to be," what should we do instead?

How AI detectors usually work

Most modern AI detectors fall into one of two camps.

The first is statistical. The tool looks at how predictable the text is. Given the first half of a sentence, how confidently could an AI model guess the second half? Human writing tends to make unexpected word choices. AI writing, optimised for plausibility, makes predictable ones. Tools like GPTZero use a metric called "perplexity" to measure this.
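As a rough illustration of the statistical approach, perplexity can be computed from the probability a language model assigned to each token that actually appeared. The sketch below assumes those per-token probabilities are already available. The numbers are invented for illustration, and this is not a description of how GPTZero or any specific tool is implemented.

```python
import math


def perplexity(token_probs):
    """Perplexity of a text, given the probability a language model
    assigned to each token that actually appeared.

    Lower perplexity means the model found the text more predictable.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


# Hypothetical per-token probabilities, invented for illustration.
predictable_text = [0.9, 0.8, 0.85, 0.9]  # the model guessed each word easily
surprising_text = [0.2, 0.05, 0.4, 0.1]   # the model was often caught off guard

print(perplexity(predictable_text) < perplexity(surprising_text))  # True
```

A statistical detector would then compare a value like this against a threshold chosen by its developers. Where that threshold sits decides how much smooth, predictable human writing ends up on the wrong side of it.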

The problem with perplexity is that good writing is predictable too. A well-edited news article, a polished academic paper, a carefully written legal document: they all hit standard structures and standard vocabulary. They look statistically AI-like because they've been cleaned up to read smoothly. So perplexity-based tools flag lots of careful human writing as AI.

The second is pattern-based. The tool looks for specific words and structures that show up disproportionately in AI output. Words like "delve." Structures like the "it's not just X, it's Y" reframing. Sentence shapes that AI defaults to. This is closer to how a human reader spots AI: by noticing the small habits.
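A pattern-based check can be sketched in a few lines. The pattern list here is hypothetical and deliberately tiny; a real tool would presumably use a far larger and more carefully tuned set.

```python
import re

# Hypothetical pattern list, for illustration only.
PATTERNS = {
    "delve": re.compile(r"\bdelv(?:e|es|ed|ing)\b", re.IGNORECASE),
    "not just X, it's Y": re.compile(
        r"\b(?:it'?s|is) not just\b[^.!?]*?,\s*(?:it'?s|but)\b", re.IGNORECASE
    ),
}


def find_patterns(text):
    """Return (pattern name, matched snippet) pairs found in the text."""
    hits = []
    for name, regex in PATTERNS.items():
        for match in regex.finditer(text):
            hits.append((name, match.group(0)))
    return hits


sample = "Let's delve into the data. It's not just a tool, it's a partner."
for name, snippet in find_patterns(sample):
    print(f"{name}: {snippet!r}")
```

Because the function returns the matched snippets and not only a count, a reader can see what triggered each hit. It still has no way of knowing who typed the word "delve"; it only reports that the word is there.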

Pattern-based detection is more interpretable, but it has a failure mode of its own. A human who happens to use one of these patterns gets flagged. Academic writers, who use a lot of formal vocabulary, get flagged constantly. Non-native English speakers, who often default to safer constructions, get flagged. Anyone trying to write "properly" trips the detector.

What that confident percentage actually represents

Here's the dirty secret. When a tool tells you "97% AI," that number isn't a probability in the way you'd expect.

It's not "97 out of 100 documents like this one were AI." It's not "we are 97% confident this is AI." It's a model's internal score, often a sigmoid output, sometimes calibrated, sometimes not. Two different detectors can score the same text 30% and 90%, and both will report their numbers with the same air of confidence.
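To make this concrete, here is a minimal sketch of how a raw model score might become a displayed percentage. The two "detectors" are hypothetical and differ only in an assumed `scale` setting that controls how steeply the same raw score is squashed. Neither is calibrated against real outcomes.

```python
import math


def sigmoid(x):
    """Squash any real-valued score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def displayed_percent(raw_score, scale):
    """Turn a raw model score into the percentage a tool might display.

    `scale` is a hypothetical design choice made by the tool's developers.
    """
    return round(100 * sigmoid(scale * raw_score))


raw_score = 0.7  # the same underlying evidence in both cases (made up)

print(displayed_percent(raw_score, scale=1.0))  # 67
print(displayed_percent(raw_score, scale=5.0))  # 97
```

The evidence is identical in both calls. Only the squashing changes, and yet one result looks like a mild suspicion and the other looks like a verdict.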

OpenAI quietly retired its own AI Text Classifier in 2023, citing "its low rate of accuracy." If the company that built ChatGPT couldn't make a reliable detector for ChatGPT's output, that should tell you something about the rest of the market.

The false positive problem

One of the most-cited papers on AI detection accuracy is a 2023 Stanford study that tested seven detectors on essays written by non-native English speakers. The results were striking. More than half of the human-written essays were flagged as AI-generated. Tools that worked well on native English writing fell apart on writing by people for whom English was a second language.

This is the false positive problem, and it cuts in predictable directions:

The people who get flagged most by AI detectors are exactly the people who can least afford to be wrongly accused: students, ESL writers, people early in their careers. The detectors aren't neutral. They have a class and education bias baked in.
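A short worked calculation shows why false positives matter so much in practice. The rates below are assumptions chosen for illustration, not measurements from any study: suppose 10% of submitted essays are AI-written, the detector catches 90% of those, and it wrongly flags 20% of human essays.

```python
def share_of_flags_that_are_human(ai_prevalence, true_positive_rate, false_positive_rate):
    """Of all the texts a detector flags, what fraction were written by humans?"""
    flagged_ai = ai_prevalence * true_positive_rate
    flagged_human = (1 - ai_prevalence) * false_positive_rate
    return flagged_human / (flagged_ai + flagged_human)


# Assumed rates, for illustration only.
share = share_of_flags_that_are_human(
    ai_prevalence=0.10,
    true_positive_rate=0.90,
    false_positive_rate=0.20,
)
print(f"{share:.0%} of flagged essays were written by humans")  # 67%
```

Under these assumed rates, two out of every three flags land on a human writer, even though the detector catches most of the AI text. The rarer real AI use is in a given group, the worse this ratio gets.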

The false negative problem

The other side of the coin: AI detectors miss a lot of actual AI. This is partly because the models keep getting better. Newer GPT-4 and Claude outputs are much harder to detect than 2022-era GPT-3 output. It's also because anyone trying to evade detection can do so trivially.

Run your AI-written text through a "humaniser" tool. Paraphrase it. Translate it to another language and back. Add a few personal anecdotes. Replace half the em dashes with commas. Most detectors fail completely.

This creates a strange dynamic. Honest writers, who don't think about gaming the detector, get caught for sounding too clean. Dishonest AI users, who run their output through a paraphraser, sail right through. The tools are worst at catching exactly the people they're supposed to catch.

What detectors are actually good for

None of this means AI detection is useless. It means we have to be honest about what it's actually for.

Here's what detectors do well:

Here's what they're bad at:

How to think about detector results

If you're going to use AI detection tools, here's a healthier mental model than "this is X% AI":

  1. A detector result is a flag, not a verdict. A high score means "this is worth a second look." Nothing more.
  2. Pair the result with what you know about the writer. Does this read like their previous work? Does it match their voice? A detector can't answer these questions. You can.
  3. Look at the evidence, not just the score. A tool that just gives you a percentage is asking you to trust it blindly. A tool that shows you which patterns triggered and why lets you make your own judgement.
  4. Be especially careful with non-native English writers. The false positive rate is dramatically higher, for reasons that have nothing to do with AI use.
  5. Never use detector output as the sole basis for an accusation. If you can't make a case without the score, you don't have a case.
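
The rules above can be sketched as a small triage function. Everything in it is hypothetical: the field names, the 0.8 threshold, and the wording of each suggested step are assumptions for illustration, not a recommended policy. The one property it is meant to show is that no path leads from the score alone to an accusation.

```python
from dataclasses import dataclass


@dataclass
class Review:
    detector_score: float        # 0.0 to 1.0, as reported by some tool
    matches_prior_work: bool     # the reviewer's own judgement of the writer's voice
    independent_evidence: bool   # anything that stands without the score


def next_step(review, flag_threshold=0.8):
    """Suggest a next step. The detector score can only raise a question."""
    flagged = review.detector_score >= flag_threshold
    if review.independent_evidence:
        return "raise the concern, citing the independent evidence"
    if flagged and not review.matches_prior_work:
        return "talk to the writer before drawing any conclusion"
    if flagged:
        return "take a second look; the score alone is not grounds for action"
    return "no action"


print(next_step(Review(0.97, matches_prior_work=True, independent_evidence=False)))
```

A 97% score on writing that matches the person's earlier work produces nothing stronger than "take a second look," which is the point of rule 1.
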

HOW WE HANDLE THIS

Telltale shows you every pattern it found, with examples and explanations. The score is there if you want it, but the evidence is the point. You can read the reasoning and decide for yourself whether it adds up.

The bigger picture

AI detection is a moving target. Models are improving faster than detectors can keep up. The tools we have today will be less useful a year from now. The fundamental problem (distinguishing text written by an actual human from text written by something that learned from human text) gets harder as the models get better.

This doesn't mean we give up on detection. It means we stop pretending detection is a solved problem with a clean answer. The honest framing is: AI detectors are indicators, useful for raising questions, not for closing them. The closer they get to acting like search engines for AI tells (here's what we noticed, and why) and the further they get from acting like polygraphs (here's the truth, in a number), the more useful they actually are.

The future of AI detection isn't going to be a tool that tells you yes or no. It's going to be a workflow where detection is one input among several, alongside knowing the writer, knowing the context, and being willing to ask before you accuse.

Published May 2026 · telltale-ai.com