The Failure of GPTZero

GPTZero touts itself as “The best AI checker”. Its page is filled with reviews from an assortment of individuals and organizations, each describing it as the premier service.

Let’s take a look at one of those reviews:

The granular detail provided by GPTZero allows administrators to observe AI usage across the institution. Faculty can identify AI usage specific to their courses, students, and assignments. This data is helping guide us on what type of education, parameters, and policies need to be in place to promote an innovative and healthy use of AI.

This review was posted (according to the website) by the director of E-Learning at South Piedmont Community College. According to GPTZero itself, it has a 51% probability of being written by an AI (you can copy and paste it in yourself and see). That’s odd: you would think GPTZero would want its page to have human-written reviews. It almost certainly does; the detector just doesn’t work.
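If you want to reproduce this check without the web interface, a minimal sketch against GPTZero’s API might look like the following. The endpoint, auth header, and response field names here are assumptions based on their published API documentation, not something I verified for this post; I only used the web UI for the numbers quoted here.

```python
# Minimal sketch: ask GPTZero to score the testimonial quoted above.
# ASSUMPTIONS: the endpoint, "x-api-key" header, "document" field, and
# "completely_generated_prob" response field are taken from GPTZero's
# public API docs as I understand them; verify before relying on this.
import requests

REVIEW = (
    "The granular detail provided by GPTZero allows administrators to observe "
    "AI usage across the institution. Faculty can identify AI usage specific "
    "to their courses, students, and assignments. This data is helping guide "
    "us on what type of education, parameters, and policies need to be in "
    "place to promote an innovative and healthy use of AI."
)

def gptzero_probability(text: str, api_key: str) -> float:
    """Return GPTZero's 'probability the document is AI-generated' score."""
    resp = requests.post(
        "https://api.gptzero.me/v2/predict/text",  # assumed endpoint
        headers={"x-api-key": api_key},            # assumed auth header
        json={"document": text},
        timeout=30,
    )
    resp.raise_for_status()
    # Assumed response shape: one entry per submitted document.
    return resp.json()["documents"][0]["completely_generated_prob"]

if __name__ == "__main__":
    prob = gptzero_probability(REVIEW, api_key="YOUR_API_KEY")
    print(f"GPTZero: {prob:.0%} likely AI-generated")
```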

GPTZero provides testimony intended to determine whether a block of text was written by any of the leading large language models. To quote them directly:

Our AI detection model contains 7 components that process text to determine if it was written by AI. We utilize a multi-step approach that aims to produce predictions that reach maximum accuracy, with the least false positives. Our model specializes in detecting content from Chat GPT, GPT 3, GPT 4, Bard, and LLaMa models.

It’s worth noting that the sentences I highlighted were predicted to have been written by an AI.

This clear failure as a model (seriously, take a look at their website, copy answers from their own FAQs and other sections, and check them) shows that it cannot be trusted, yet people are using it as a plagiarism detection service. Unfortunately, I ran out of free uses of their service, so I was unable to keep checking actual AI-generated text, but the one or two attempts I did make returned the same percentage as their own website’s text.

Here’s one example (I asked ChatGPT to generate an interesting story in two sentences):

As the clock struck midnight, the abandoned mansion began to whisper secrets from its centuries-old walls. Shadows danced, and forgotten echoes of laughter filled the air, inviting the curious to step into a world where time had no hold, and mysteries begged to be unraveled.

GPTZero said there was a 51% chance that this text was AI-generated. Let’s pause for a second and look at that number. It’s the exact same number as their own website’s text, and the exact same number I’ve gotten out of almost every input into their model (one testimonial on their website produced a 4%, but that’s the only time I’ve seen anything other than 51%).

It seems unlikely that, almost 100% of the time, content is exactly 51% likely to have been written by an LLM. Unless the model were just guessing, which it likely is.
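To see why a near-constant score is worthless, score the detector the way classifiers are normally scored. A model that returns 0.51 for every input cannot rank AI text above human text at all, so its ranking performance (AUC) is exactly 0.5, the same as a coin flip. A quick, purely illustrative sketch (the scores and labels are hypothetical):

```python
# Purely illustrative: if a detector gives (almost) the same score to every
# input, it cannot separate AI text from human text. AUC below is the fraction
# of (AI, human) pairs where the AI text gets the higher score, counting ties
# as half; 0.5 means the ranking is no better than a coin flip.

def auc(ai_scores, human_scores):
    wins = sum(
        1.0 if a > h else 0.5 if a == h else 0.0
        for a in ai_scores
        for h in human_scores
    )
    return wins / (len(ai_scores) * len(human_scores))

# Hypothetical: three AI-written and three human-written texts, all of which
# the detector scored at 51%, like the inputs described above.
ai_scores = [0.51, 0.51, 0.51]
human_scores = [0.51, 0.51, 0.51]

print(auc(ai_scores, human_scores))  # 0.5 -- indistinguishable from guessing
```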

While testimony generated by GPT should be taken with a grain of salt, so should testimony from tools designed to detect whether text was generated by an LLM. Detection models have never worked terribly well in any field, so why should we believe they suddenly work well against some of the most advanced technology we have ever created? Large language models should be embraced as tools: they are part of our lives now and certainly are not going anywhere.

OpenAI (the creator of ChatGPT and DALL-E) even tried to make an AI detector, but had to shut it down because it was too inaccurate (according to the Washington Post; I cannot find any first-hand information about this beyond an image detection tool that has not been released yet).

As a footnote: GPTZero believes that this post was written entirely by a human (every sentence, including the quotes). As one of those quotes was entirely AI-generated, it seems there are many ways to trick it.

Further reading and references:

https://futurism.com/gptzero-accuracy

https://gptzero.me/

https://www.nbcnewyork.com/investigations/fake-news-chatgpt-has-a-knack-for-making-up-phony-anonymous-sources/4120307/

https://www.npr.org/2023/01/09/1147549845/gptzero-ai-chatgpt-edward-tian-plagiarism

https://www.washingtonpost.com/technology/2023/08/14/prove-false-positive-ai-detection-turnitin-gptzero/

https://www.reddit.com/r/ChatGPT/comments/10d9l2x/how_do_i_bypass_gptzero/

https://www.reddit.com/r/ChatGPT/comments/1155shx/gpt_zero_is_not_accurate_at_all/