Almost every day, a new story surfaces about the delights or perils of AI—it’s either going to revolutionize our lives or end human civilization. Or maybe both.
But putting the hype aside, we are where we are, as the experts say. And when it comes to AI apps, the two major players being adopted by the masses are AI content generators and AI content detectors. Here, I’ll dive into the latter to see whether these AI content detection apps can outsmart their AI-generating cousins.
Spoiler alert: as you might expect, both have flaws. But that’s ok. This technology is changing all the time, and both AI content detectors and the engines creating the AI content they’re detecting will improve. It’s just a case of whether the detectors can keep pace with the generators.
I spent considerable time testing loads of AI content detectors to narrow it down to the six best apps. Here they are.
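The 6 best AI content detectors
- TraceGPT for accuracy
- Winston AI for integrations
- Hive for a free AI content detector
- GPTZero for extra writing analysis features
- Originality.ai for different models based on risk tolerance
- Smodin for affordable unlimited use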
What makes the best AI content detector?
How we evaluate and test apps
Our best apps roundups are written by humans who’ve spent much of their careers using, testing, and writing about software. Unless explicitly stated, we spend dozens of hours researching and testing apps, using each app as it’s intended to be used and evaluating it against the criteria we set for the category. We’re never paid for placement in our articles from any app or for links to any site—we value the trust readers put in us to offer authentic evaluations of the categories and apps we review. For more details on our process, read the full rundown of how we select apps to feature on the Zapier blog.
There are plenty of lists of the best AI content detectors, so what makes this one different? For starters, lots of lists are looking at apps that “detect and reword AI-sounding content.” But I’m not looking for a strange combo of detection and generation so people can churn out AI-generated content that goes undetected. Instead, my focus is on AI content detectors that help you identify AI content—whether you’re a teacher, a content manager, or anyone else who wants to be sure humans are producing the content you’re reading.
Also, I didn’t just read these apps’ marketing materials and customer reviews. I spent dozens of hours researching and testing the best AI content detectors.
So, how do you test for AI-generated content? Well, rightly or wrongly, here’s what I did.
- First, I decided on a topic to test. I wanted something I knew was 100% human, so I chose a previous article I’d written: How to change your passwords in 6 steps.
- Next, I needed some AI-generated content on the same topic, so I asked ChatGPT (V3.5) and Claude (V3 Sonnet) to write an article using the prompt: “Write a 1,500-word article on ‘how to change your passwords.’” (Admittedly, neither app was keen to write that much, and it took a bit of coaxing, but in the end we got somewhere near the mark. As it turned out, some of the content detectors wouldn’t let you enter that much content, so I ended up making each piece around 700 words to ensure each tool had the same test.)
- Finally, I created a piece of mixed content by using the start of my human article and ending with a portion of the ChatGPT text.
So, in the end, I tested each app on four pieces of text: Human, ChatGPT, Claude, and Mixed.
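If you want to build a comparable Mixed sample yourself, here is a minimal Python sketch. It assumes you have the human and AI articles as plain strings. The function name and the `human_share` parameter are hypothetical, and the 50/50 default is only a guess: I didn't record the exact proportions of my own mixed piece.

```python
def build_mixed_sample(human_text: str, ai_text: str,
                       total_words: int = 700,
                       human_share: float = 0.5) -> str:
    """Join the opening of a human article to the tail of an AI article.

    human_share is an assumed value; the article does not state how much
    of the mixed sample was human-written.
    """
    human_words = human_text.split()
    ai_words = ai_text.split()
    n_human = round(total_words * human_share)
    n_ai = total_words - n_human
    head = human_words[:n_human]               # start of the human article
    tail = ai_words[-n_ai:] if n_ai > 0 else []  # end of the AI article
    return " ".join(head + tail)
```

Note that splitting on whitespace flattens paragraph breaks, so treat the output as raw test input rather than a readable article.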
As I was testing the apps, here’s what I was looking for:
- Ease of use: How easy is the tool to use? Are there any restrictions that might be too prohibitive?
- Accuracy: How well does the tool detect AI-generated content? The best AI content detectors should have minimal false positives and negatives and offer somewhat reliable results. It’s a fast-moving landscape, but I wanted to have at least 75% confidence in the results. (A low threshold? Maybe. But that’s where we are right now.)
- Interpretability: Closely linked to accuracy is interpretability. For example, can the app detect AI content from multiple LLMs (e.g., GPT, Gemini, Llama, Claude, Falcon), differentiate between AI, human, and mixed (AI + human) content, or produce sentence-level AI highlighting and reporting?
- Extra features: I looked for any additional functionality, such as a browser extension, a plagiarism checker, an API, or integrations with other tools like Google Docs, Microsoft Word, Canvas, Blackboard, or other classroom applications and LMS platforms.
- Scalability: Finally, you’ll want to know how much content the app can analyze without compromising accuracy while remaining affordable. In other words, is the tool good for 1,500 words tops, or can it analyze larger volumes?
Overall, I whittled the list down from over 30 possibles to the six best AI content detectors.
The best AI content detectors at a glance
| | Accuracy | Extra features | Pricing |
|---|---|---|---|
| TraceGPT | ⭐⭐⭐⭐⭐ Basically perfect (and confident) | Plagiarism checker, authorship verification tool, Chrome extension, custom GPT | From $5.99 for 20 pages (1 page = 275 words) |
| Winston AI | ⭐⭐⭐⭐ Identified Claude as human-generated; did well otherwise | Plagiarism checker, readability score, ability to scan documents, pictures, and handwriting (OCR), browser extension for instant scanning, custom GPT, integrates with Zapier | From $12/month (80,000 words) or $19/month (200,000 words) |
| Hive | ⭐⭐⭐⭐ Very confident, but 100% wrong on Claude | Chrome extension | Free |
| GPTZero | ⭐⭐⭐⭐ Didn’t do great with Claude; solid on the rest | Chrome extension, plagiarism checker, API access, integrations | Basic free plan for scanning up to 10,000 words per month; premium plans from $10/month |
| Originality.ai | ⭐⭐⭐ Mixed; solid on ChatGPT, but not as great on the rest | Plagiarism checker, readability analysis, automated fact-checker, API access | From $14.95/month or $30 pay-as-you-go |
| Smodin | ⭐⭐⭐ Pretty solid except for Claude; not quite as confident though | Plagiarism checker, summarizer, rewriter, and writer (generative AI) | Limited free plan includes 5 free weekly uses; paid plans start at $12/month |
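Because the plans are quoted in different units (pages, words, months), a rough way to compare them is cost per 1,000 words. The sketch below only uses plans where this article states both a price and a word allowance. The GPTZero allowance of 150,000 words comes from its pricing note later in the article, and the comparison assumes you use a plan's full allowance, which you may not. TraceGPT's price is for a 20-page pack, while the others are monthly subscriptions, so treat the numbers as ballpark figures only.

```python
PAGE_WORDS = 275  # TraceGPT defines 1 page as 275 words

# (price in USD, words covered) as quoted in this article
plans = {
    "TraceGPT (20 pages)": (5.99, 20 * PAGE_WORDS),
    "Winston AI ($12/month)": (12.00, 80_000),
    "Winston AI ($19/month)": (19.00, 200_000),
    "GPTZero ($10/month)": (10.00, 150_000),
}


def cost_per_1000_words(price: float, words: int) -> float:
    """Price of scanning 1,000 words if the full allowance is used."""
    return price / words * 1000


for name, (price, words) in plans.items():
    print(f"{name}: ${cost_per_1000_words(price, words):.3f} per 1,000 words")
```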
The best AI content detector for accuracy
TraceGPT
TraceGPT (also referred to as AI Plagiarism Checker & ChatGPT Content AI Detector) is part of PlagiarismCheck.org.
TraceGPT accuracy: Basically perfect (and confident)
TraceGPT scored top marks for accuracy (and got bonus points for its quick processing speed). These were the results:
- Human: 0.00% Potentially AI-generated
- ChatGPT: 99.91% Potentially AI-generated
- Claude: 99.93% Potentially AI-generated
- Mixed: 46.02% Potentially AI-generated
How it works
To use this AI content detector, you’ll need to sign up for an account either as an individual or a team/organization. Then, you can copy/paste your text directly into the app or upload a file (.doc/.docx/.txt/.odt/.rtf/.pdf). Click Proceed, and TraceGPT quickly returns a result and highlights what it believes is the AI-generated text. Note: The maximum text length the AI detector can process in one go is 307,200 characters (~170 pages). Not bad.
For example, with the Mixed content, it reckoned 46.02% of the content was potentially AI-generated. On screen, it highlighted the AI-detected content (in different shades) for Likely (38.22%) and Highly Likely (7.80%). You also have the option to download a PDF report of its findings.
Extra features
- Plagiarism checker
- Authorship verification tool
- Chrome extension
- Custom GPT
TraceGPT pricing: There’s no free plan advertised, but after creating an account, I ran my tests for AI without purchasing a subscription. If you want to use the plagiarism checker, you’ll need a subscription, starting at $5.99 for 20 pages (1 page = 275 words). TraceGPT comes as a free addition to the Plagiarism Detector’s plans. They told me that if you only need the AI detector, you should contact PlagiarismCheck.org for a custom offer.
The best AI content detector for integrations
Winston AI
Winston AI is a dedicated AI content detector that works with GPT-4, Google Gemini, and other LLMs.
Winston AI accuracy: Identified Claude as human-generated; did pretty well on everything else
It failed on one of the tests, identifying Claude as probably human content:
- Human: Probably 85% human.
- ChatGPT: Highly probable that an AI gen tool was used. Only possibly 7% human.
- Claude: Probably 82% human.
- Mixed: Winston has detected the text as 42% human. Our assessment is that an AI tool was likely used to generate all, or a good part of the text. (Most of the text was correctly identified.)
How it works
You’ll need to create an account to use Winston AI and get a 7-day free trial. Once that’s set up, you have three options for checking your content: paste your text, upload a file, or import from a URL.
Winston AI requires a minimum of 500 characters to test, and then lets you know, on a scale of 0-100, the probability the text is human or AI-generated. You also get AI sentences highlighted in the results and the option to generate a shareable PDF report.
Extra features
- Plagiarism checker
- Readability score
- Ability to scan documents, pictures, and handwriting (OCR)
- Lots of browser extensions (Microsoft Edge, Opera, Firefox, Google Chrome)
- Custom GPT
Winston AI also integrates with Blackboard and Google Classroom, and businesses can access the tool via an API to integrate it with their own systems. Or you can integrate Winston AI with Zapier to connect it to all the other apps you use and automate your AI content detection workflows.
Winston AI pricing: The advertised free account is actually a free trial, limited to 2,000 words over 7 days. Premium plans start at $12/month (80,000 words) or $19/month (200,000 words). You can also get a custom plan if you need to scale further.
The best free AI content detector
Hive
The Hive Moderation AI-generated content detection tool is part of Hive’s automated content moderation tools. It can also detect AI-generated images, videos, and audio.
Accuracy: Very confident, but 100% wrong on the Claude sample
Hive failed on Claude but got the other content spot on:
- Human: 0% – The input is not likely to contain AI Generated Text.
- ChatGPT: 99.9% – The input is likely to contain AI Generated Text.
- Claude: 0% – The input is not likely to contain AI Generated Text.
- Mixed: 99.9% – The input is likely to contain AI Generated Text. (And it highlighted the two segments of human and AI content correctly.)
How it works
You don’t need an account to use Hive’s AI text detector. Simply paste your text (up to 8,192 characters) into the input box. The text must be longer than 750 characters (preferably at least 1,500 characters) to get a fair result.
Hive then gives a probability score of how likely the text contains AI-generated text and highlights the affected segment. That’s it—there are no other reports to download, but for a free tool, it’s enough.
Extra features
- Chrome extension
Hive pricing: Free
The best AI content detector for extra writing analysis features
GPTZero
GPTZero specializes in detecting content from GPT-3, GPT-4, Gemini, Claude, and Llama models. It uses what it calls a seven-layer detection model to determine AI-generated content. Probably not as tasty as the dip.
Accuracy: Didn’t do great with Claude and was confused by the mixed content, but did well on the rest
Although GPTZero claims it can detect content from Claude, it definitely failed the test. It was fine with the Human and ChatGPT tests, but wasn’t 100% sure about the mixed content.
- Human: 95% human. We are highly confident this text is entirely human.
- ChatGPT: 100% AI. We are highly confident this text was AI-generated.
- Claude: 88% human / 5% mixed / 7% AI. We are moderately confident this text is entirely human.
- Mixed: 53% human / 5% mixed / 42% AI. We are uncertain about this document. If we had to classify it, it would likely be considered human.
How it works
GPTZero starts with a welcome tutorial, but it’s easy to navigate and work out yourself if you want to skip it. Like the other apps, you can copy/paste the text you want to analyze (min 250 / max 5,000 characters) into the input box or upload a file. The scan runs quickly and presents the results on screen.
In the Scan Summary, there’s a Document Classification—e.g., “human”—and a Probability Breakdown, showing a sliding scale from human to mixed to AI. You can keep the report private, share it, or download a copy. The scan results are also stored in your dashboard, so you can always refer back to them.
The Basic Scan section highlights sentences that are likely AI-generated. Premium plans can also access the Deep Scan, which has color-coded highlights for the different AI and human segments.
Finally, in the Writing Analysis section, you get a detailed breakdown, including readability, average sentence length, and simplicity. The analysis also includes measures of perplexity and burstiness—two AI scoring parameters:
- Perplexity measures how predictable the text is to a language model. If GPTZero is “perplexed” by unexpected word choices, the text is more likely to be human-written; if the text is easy to predict, it’s more likely AI-generated.
- Burstiness measures how much sentences vary. AI bots tend to stitch sentences together at a predictable, uniform length, while humans write with greater variation.
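To make those two ideas concrete, here is a toy Python sketch. It is not GPTZero's method, which isn't documented in this article. As stand-ins, it assumes burstiness can be approximated by the coefficient of variation of sentence length, and perplexity by a tiny add-one-smoothed unigram model built from a reference text. Real detectors use large language models to score predictability. All function names are hypothetical.

```python
import math
import re
from collections import Counter
from statistics import mean, pstdev


def sentence_lengths(text: str) -> list[int]:
    """Word counts per sentence, splitting naively on ., ! and ?"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence length (0 = perfectly uniform)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return pstdev(lengths) / mean(lengths)


def unigram_perplexity(text: str, reference: str) -> float:
    """Perplexity of text under a unigram model of reference (add-one smoothing)."""
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text must contain at least one word")
    ref_tokens = reference.lower().split()
    counts = Counter(ref_tokens)
    vocab_size = len(set(ref_tokens) | set(tokens))
    total = len(ref_tokens)
    # Average log-probability per token, then exponentiate the negative.
    log_prob = sum(math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens)
    return math.exp(-log_prob / len(tokens))
```

With this toy model, text made of words the reference has seen often scores a lower perplexity than text full of unseen words, and a passage that alternates short and long sentences scores a higher burstiness than one with sentences of equal length.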
Extra features
- Google Chrome extension (named Origin)
- Plagiarism checker
- API access for large organizations
- Several integrations, including Google Docs and Microsoft Word add-ons, Canvas, Blackboard, and other classroom applications
GPTZero pricing: There’s a basic free plan for scanning up to 10,000 words per month and 7 scans per hour. Premium plans start from $10/month (150,000 words), rising to $23/month for organizations and enterprises (500,000 words) with advanced data security and SSO.
The best AI content detector for different models based on risk tolerance
Originality.ai
Originality.ai caters to content publishers, agencies, and writers and covers multiple models, including GPT-4 and Claude 2.
Accuracy: Mixed; solid on ChatGPT, but not as great on the rest
Originality.ai has two AI detection models—Standard 2.0 and Turbo 3.0—which provide widely differing scores. They recommend using Turbo 3.0 if you have a zero risk tolerance, as it can detect even a whiff of AI—or so it says—and Standard 2.0 if you’re ok with slight use of AI, such as AI editing.
You can see the marked difference and mixed results depending on the AI detection model used:
- Human: 83% Original 17% AI (Standard 2.0) vs. 44% Original 56% AI (Turbo 3.0)
- ChatGPT: 0% Original 100% AI (Standard 2.0) vs. 0% Original 100% AI (Turbo 3.0)
- Claude: 100% Original 0% AI (Standard 2.0) vs. 49% Original 51% AI (Turbo 3.0)
- Mixed: 50% Original 50% AI (Standard 2.0) vs. 9% Original 91% AI (Turbo 3.0)
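Based on these results, Turbo 3.0 is too strict: it scored my human-written article as 56% AI. The Standard 2.0 results were more accurate on the Human and Mixed samples, although Standard 2.0 missed the Claude text entirely.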
How it works
Once you’ve created an account with Originality.ai, you can paste or write content in the input box, choose an AI detection model, and then start the scan. Although it’s easy to use, I do think the web app could have been faster.
In terms of results, you get an overall score and sentence-level highlighting, and all scans are stored in your dashboard. You can also assign roles with permission levels to each team member.
Extra features
- Plagiarism checker
- Readability analysis
- Automated fact-checker
- API access
Originality.ai pricing: There’s no free plan, but you can get 50 credits by installing the free AI detection Chrome Extension to trial its detection capabilities. (One credit can scan 100 words.) There are two premium plans: a $30 pay-as-you-go option or a $14.95/month subscription.
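To get a feel for how far the trial credits stretch, here is a small Python sketch of the credit arithmetic. It assumes partial blocks of 100 words are rounded up to a whole credit, which is a guess on my part and not something Originality.ai's pricing states here. The function name is hypothetical.

```python
import math

WORDS_PER_CREDIT = 100  # "One credit can scan 100 words"
TRIAL_CREDITS = 50      # credits from installing the Chrome extension


def credits_needed(word_count: int) -> int:
    """Credits to scan a document, assuming partial blocks round up."""
    return math.ceil(word_count / WORDS_PER_CREDIT)


# Under that assumption, a 700-word test article uses 7 credits, and the
# 50 trial credits cover 50 * 100 = 5,000 words in total.
trial_word_capacity = TRIAL_CREDITS * WORDS_PER_CREDIT
```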
The best AI content detector for affordable unlimited use
Smodin
Smodin offers a suite of writing tools including an AI content detector that works with ChatGPT, Bard, and other AI generators.
Accuracy: Pretty solid except for Claude; not quite as confident though
Smodin scored quite well on the tests. It correctly identified three content sources but failed on the Claude content. Having said that, I got two sets of results when I left a few days between tests, which suggests either that the tool is inconsistent or, more likely, that it has been retrained and updated. (You’ll see both tests below.)
- Human: Content is likely Human written. (24.8% vs. 9.2% likelihood of complete AI content.)
- ChatGPT: Content is likely AI written. (81.4% vs. 62.4% likelihood of complete AI content.)
- Claude: Content is likely Human and AI. (57.4% vs. 12.1% likelihood of complete AI content.)
- Mixed: Content is likely Human and AI. (60.8% vs. 31.7% likelihood of complete AI content.)
How it works
Like the other apps, Smodin is easy to use: simply paste your text into the input box or upload a file. You can input up to 5,000 characters on the free plan and up to 50,000 characters on the Ultimate plan, with an option to scale further on a custom enterprise plan. The on-screen results highlight AI-generated segments and sentences.
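Since the character limits differ so much from tool to tool, it can help to check a sample against them before pasting it in. The Python sketch below collects only the limits quoted in this article. `None` means the article doesn't state a limit, not that the tool has none. Treating the minimums as inclusive is an assumption, and the names are hypothetical.

```python
from typing import NamedTuple, Optional


class Limits(NamedTuple):
    min_chars: Optional[int]  # None = not stated in this article
    max_chars: Optional[int]


# Character limits as quoted in the tool sections of this article
LIMITS = {
    "TraceGPT": Limits(None, 307_200),
    "Winston AI": Limits(500, None),
    "Hive": Limits(750, 8_192),
    "GPTZero": Limits(250, 5_000),
    "Smodin (free plan)": Limits(None, 5_000),
}


def accepts(tool: str, text: str) -> bool:
    """True if the text length falls within the tool's quoted limits."""
    limits = LIMITS[tool]
    n = len(text)
    if limits.min_chars is not None and n < limits.min_chars:
        return False
    if limits.max_chars is not None and n > limits.max_chars:
        return False
    return True


def tools_accepting(text: str) -> list[str]:
    """Names of the tools whose quoted limits the text satisfies."""
    return [tool for tool in LIMITS if accepts(tool, text)]
```

For example, a 6,000-character sample fits TraceGPT, Winston AI, and Hive under these figures, but is too long for GPTZero's input box and Smodin's free plan.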
Extra features
- Plagiarism checker
- Summarizer
- Rewriter
- Writer (generative AI)
Smodin pricing: The limited free plan includes 5 free weekly uses. Paid plans with unlimited usage start at $12/month (annual billing).
Claude detection failures
One common thread in my test results was that all of these tools—except for TraceGPT—failed to identify the Claude text as AI-generated.
Another tool, Trinka, also identified the Claude text as AI-generated but didn’t make the cut because it flagged every sample, including my human-written article, as AI-generated:
- Human: AI-Generated Text (83.95%)
- ChatGPT: AI-Generated Text (100.00%)
- Claude: AI-Generated Text (74.17%)
- Mixed: AI-Generated Text (100.00%)
The fact that most detectors failed on Claude indicates to me that (a) Claude is better at creating human-sounding content (which is generally the consensus among writers) and (b) most of these tools are probably trained mostly on GPT and not as much on Claude.
Should you use (and trust) an AI content detector?
The AI content landscape is changing constantly. Although AI detection tools are improving, they still have limitations. Take these results, for example. In some cases, they can’t distinguish between highly sophisticated AI-generated text and human-written text. As AI content generation tools develop ways to sound more human, content detection models need training on more examples. It’s a proverbial game of cat and mouse.
In short, AI content detectors, like AI content generators, are imperfect, so tread carefully and use common sense.
This article was originally published in July 2023 by Shubham Agarwal. The most recent update was in May 2024.