ChatGPT creator rolls out ‘imperfect’ tool to help teachers spot potential cheating

(CNN) — Two months after OpenAI unnerved some educators with the public release of ChatGPT, an AI chatbot that can help students and professionals generate shockingly convincing essays, the company is unveiling a new tool to help teachers adapt.

OpenAI on Tuesday announced a new feature, called an “AI text classifier,” that allows users to check if an essay was written by a human or AI. But even OpenAI admits it’s “imperfect.”

The tool, which works on English AI-generated text, is powered by a machine learning system that takes an input and assigns it to several categories. In this case, after pasting a body of text such as a school essay into the new tool, it will give one of five possible outcomes, ranging from “likely generated by AI” to “very unlikely.”

Lama Ahmad, policy research director at OpenAI, told CNN that educators have been asking for a ChatGPT feature like this, but warns it should be “taken with a grain of salt.”

“We really don’t recommend taking this tool in isolation because we know that it can be wrong and will be wrong at times — much like using AI for any kind of assessment purposes,” Ahmad said. “We are emphasizing how important it is to keep a human in the loop … and that it’s just one data point among many others.”

Ahmad notes that some teachers have referenced past examples of student work and writing style to gauge whether it was written by the student. While the new tool might provide another reference point, Ahmad said “teachers need to be really careful in how they include it in academic dishonesty decisions.”

Since it was made available in late November, ChatGPT has been used to generate original essays, stories and song lyrics in response to user prompts. It has drafted research paper abstracts that fooled some scientists. It even recently passed law exams in four courses at the University of Minnesota, another exam at University of Pennsylvania’s Wharton School of Business and a US medical licensing exam.

In the process, it has raised alarms among some educators. Public schools in New York City and Seattle have already banned students and teachers from using ChatGPT on the district’s networks and devices. Some educators are now moving with remarkable speed to rethink their assignments in response to ChatGPT, even as it remains unclear how widespread use is of the tool among students and how harmful it could really be to learning.

OpenAI now joins a small but growing list of efforts to help educators detect when a written work is generated by ChatGPT. Some companies such as Turnitin are actively working on ChatGPT plagiarism detection tools that could help teachers identify when assignments are written by the tool. Meanwhile, Princeton student Edward Tuan told CNN more than 95,000 people have already tried the beta version of his own ChatGPT detection feature, called ZeroGPT, noting there has been “incredible demand among teachers” so far.

Jan Leike — a lead on the OpenAI alignment team, which works to make sure the AI tool is aligned with human values — listed several reasons for why detecting plagiarism via ChatGPT may be a challenge. People can edit text to avoid being identified by the tool, for example. It will also “be best at identifying text that is very similar to the kind of text that we’ve trained it on.”

In addition, the company said it’s impossible to determine if predictable text — such as a list of the first 1,000 prime numbers — was written by AI or a human because the correct answer is always the same, according to a company blog post. The classifier is also “very unreliable” on short texts below 1,000 characters.

During a demo with CNN ahead of Tuesday’s launch, ChatGPT successfully labeled several bodies of work. An excerpt from the book “Peter Pan,” for example, was deemed “unlikely” to be AI generated. In the company blog post, however, OpenAI said it incorrectly labeled human-written text as AI-written 5% of the time.

Despite the possibility of false positives, Leike said the company aims to use the tool to spark conversations around AI literacy and possibly deter people from claiming that AI-written text was created by a human. He said the decision to release the new feature also stems from the debate around whether humans have a right to know if they’re interacting with AI.

“This question is much bigger than what we are doing here; society as a whole has to grapple with that question,” he said.

OpenAI said it encourages the general public to share their feedback on the AI check feature. Ahmad said the company continues to talk with K-12 educators and those at the collegiate level and beyond, such as Harvard University and the Stanford Design School.

The company sees its role as “an educator to the educators,” according to Ahmad, in the sense that OpenAI wants to make them more “aware about the technologies and what they can be used for and what they should not be used for.”

“We’re not educators ourselves — we’re very aware of that — and so our goals are really to help equip teachers to deploy these models effectively in and out of the classroom,” Ahmad said. “That means giving them the language to speak about it, help them understand the capabilities and the limitations, and then secondarily through them, equip students to navigate the complexities that AI is already introducing in the world.”

Trending