Raidar: geneRative AI Detection viA Rewriting

We find that large language models (LLMs) are more likely to modify human-written text than AI-generated text when tasked with rewriting. This tendency arises because LLMs often perceive AI-generated text as high-quality, leading to fewer modifications. We introduce a method to detect AI-generated content by prompting LLMs to rewrite text and calculating the edit distance between the input and the rewritten output. We dub our geneRative AI Detection viA Rewriting method Raidar. Raidar significantly improves the F1 detection scores of existing AI content detection models, both academic and commercial, across various domains, including news, creative writing, student essays, code, Yelp reviews, and arXiv papers, with gains of up to 29 points. Operating solely on word symbols without high-dimensional features, our method is compatible with black-box LLMs and is inherently robust on new content. Our results illustrate the unique imprint of machine-generated text through the lens of the machines themselves.
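
The rewrite-and-compare idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the prompt, the exact distance features, or the classifier, so the `rewrite` callable (standing in for any black-box LLM call), the length normalization, and the `threshold` value of 0.2 are all assumptions made for illustration.

```python
from typing import Callable, List


def word_edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,           # delete tok_a
                            curr[j - 1] + 1,       # insert tok_b
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def rewrite_change_ratio(text: str, rewrite: Callable[[str], str]) -> float:
    """Fraction of words changed when `rewrite` (hypothetical LLM call) edits `text`."""
    original = text.split()
    rewritten = rewrite(text).split()
    longest = max(len(original), len(rewritten), 1)  # avoid division by zero
    return word_edit_distance(original, rewritten) / longest


def looks_ai_generated(text: str,
                       rewrite: Callable[[str], str],
                       threshold: float = 0.2) -> bool:
    """Flag text as AI-generated when the rewrite changes few words.

    The threshold is an illustrative assumption; in practice it would be
    tuned, or replaced by a classifier trained on labeled examples.
    """
    return rewrite_change_ratio(text, rewrite) < threshold
```

Here `rewrite` would wrap a call to an LLM with a rewriting prompt. Because only the returned words are compared, the sketch needs no access to model logits or embeddings, which is what makes this kind of approach usable with black-box models.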
