A 22-year-old student from Princeton, Edward Tian, has designed an app to discern whether text is human- or AI-generated. The tool, GPTZero, is already garnering interest from potential investors and will come as a relief to teachers and others who are worried about the advanced abilities of ChatGPT, OpenAI’s new text generator. According to a piece from Fast Company,
“Tian says his tool measures randomness in sentences (“perplexity”) plus overall randomness (“burstiness”) to calculate the probability that the text was written by ChatGPT. Since tweeting about GPTZero on January 2, Tian says he’s already been approached by VCs wanting to invest and will be developing updated versions soon.” Megan Morrone, “Was this written by a robot? These tools help detect AI-generated text” (fastcompany.com)
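To make the two terms concrete, here is a toy sketch in Python. It is not GPTZero's code, and the source does not give Tian's formulas. It assumes a simple word-frequency model in place of a real language model, treats perplexity as how surprising a sentence is to that model, and treats burstiness as how much perplexity varies from sentence to sentence. All function names are hypothetical.

```python
import math
import re
from collections import Counter
from statistics import pstdev


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_unigram(corpus):
    """Build a smoothed word-probability function from a reference corpus.

    Assumption: a unigram model stands in for the neural language model
    that a real detector would use.
    """
    counts = Counter(tokenize(corpus))
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserves probability for unseen words

    def prob(word):
        return (counts[word] + 1) / (total + vocab)

    return prob


def perplexity(sentence, prob):
    """Exponential of the average negative log-probability per word."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    avg_log = sum(math.log(prob(t)) for t in tokens) / len(tokens)
    return math.exp(-avg_log)


def burstiness(text, prob):
    """Spread (population standard deviation) of per-sentence perplexity."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    scores = [perplexity(s, prob) for s in sentences]
    return pstdev(scores) if len(scores) > 1 else 0.0
```

Under these assumptions, text whose sentences are all about equally predictable gets a burstiness near zero, while text that mixes plain and surprising sentences scores higher. How such numbers would be turned into a probability of AI authorship is not described in the source.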
If teachers do use the app, however, they may end up unjustly accusing a minority of students of misconduct when the detector fails to make accurate guesses.
In addition, OpenAI is reportedly working on a new software program that will “watermark” any text generated by ChatGPT. Scott Aaronson, a visiting researcher at OpenAI, says,
“Basically, whenever GPT generates some long text, we want there to be an otherwise unnoticeable secret signal in its choices of words, which you can use to prove later that, yes, this came from GPT. We want it to be much harder to take a GPT output and pass it off as if it came from a human. This could be helpful for preventing academic plagiarism, obviously, but also, for example, mass generation of propaganda—you know, spamming every blog with seemingly on-topic comments supporting Russia’s invasion of Ukraine, without even a building full of trolls in Moscow. Or impersonating someone’s writing style in order to incriminate them.” Scott Aaronson, “My AI Safety Lecture for UT Effective Altruism,” Shtetl-Optimized (scottaaronson.blog)
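The idea of a hidden signal in word choices can be illustrated with a toy example. The Python sketch below is an assumption-laden illustration, not OpenAI's scheme: it supposes that a writer chooses among interchangeable candidate words and that a secret key, through a keyed hash, quietly decides which candidate wins. Someone holding the same key can then check whether the chosen words score unusually high. The function names and the scoring rule are hypothetical.

```python
import hashlib
import hmac


def keyed_score(key: bytes, prev_word: str, word: str) -> float:
    """Map (previous word, word) to a pseudorandom score between 0 and 1."""
    msg = f"{prev_word}\x00{word}".encode("utf-8")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def write_with_watermark(key, slots):
    """For each slot, pick the candidate word with the highest keyed score."""
    words, prev = [], ""
    for candidates in slots:
        best = max(candidates, key=lambda w: keyed_score(key, prev, w))
        words.append(best)
        prev = best
    return words


def watermark_strength(key, words):
    """Average keyed score of a word sequence.

    Ordinary text should average about 0.5; a noticeably higher average
    hints that the words were chosen with this key.
    """
    if not words:
        return 0.0
    prev, total = "", 0.0
    for w in words:
        total += keyed_score(key, prev, w)
        prev = w
    return total / len(words)
```

Without the key, the chosen words look like ordinary choices among synonyms, which is the sense in which such a signal would be "otherwise unnoticeable." A real system would have to work on a model's token probabilities and survive editing, neither of which this sketch attempts.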
Aaronson says that ten or even five years ago, he did not expect AI to be this advanced by 2023. These new detector tools should help.
There’s also the GPT-2 output detector model, which estimates the probability that a given text was written by a human rather than generated by AI. OpenAI engineers say it is quite good but needs some more work to be perfected.
In 2019, researchers from the MIT-IBM Watson AI Lab and the Harvard Natural Language Processing Group teamed up to design their own detector algorithm. Humans are more likely than language models to make unlikely word choices, so the algorithm tries to guess each next word in a sentence. If its guesses are mostly right, it concludes that the text is probably AI-generated. While such algorithms are not perfect, they have a pretty high success rate in discerning the author of output text.
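The guess-the-next-word approach can be sketched in a few lines of Python. This is a simplified illustration rather than the researchers' algorithm: it assumes a tiny word-pair model built from a reference corpus in place of a large language model, and it reports the share of words that fall among the model's top guesses. The function names and the `top_k` parameter are hypothetical.

```python
from collections import Counter


def train_bigrams(corpus):
    """Count which word follows which in a reference corpus."""
    words = corpus.lower().split()
    following = {}
    for prev, nxt in zip(words, words[1:]):
        following.setdefault(prev, Counter())[nxt] += 1
    return following


def predictable_fraction(text, following, top_k=1):
    """Share of words found among the model's top_k guesses for that position."""
    words = text.lower().split()
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    hits = 0
    for prev, actual in pairs:
        # Words the model considers most likely to follow `prev`.
        guesses = [w for w, _ in following.get(prev, Counter()).most_common(top_k)]
        if actual in guesses:
            hits += 1
    return hits / len(pairs)
```

A fraction close to 1 means the text rarely surprises the model, which under the reasoning above points toward machine generation. A lower fraction suggests more of the unexpected word choices that humans tend to make. Where to draw the line between the two is a judgment call this sketch leaves open.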
While new AI systems are impressive, a slew of brilliant software engineers are using AI detection to regulate AI.