"AI Writing Detection at Its Limits: Questioning the Fundamentals | META-X"

"Spoofing Attacks That Make Text Look AI-Written Are Also Possible"

According to a 2025 University of Maryland research paper ("Can AI-Generated Text be Reliably Detected?" by Vinu Sankar Sadasivan et al.), most detectors of AI-written text are easily defeated by practical attacks such as evasion and paraphrasing. The finding is a warning for any setting that automatically stamps text with an "AI-written" label.

The Maryland team's experiments found that a simple attack called "recursive paraphrasing" sharply reduces the detection capability of the vast majority of AI text detectors, whether they rely on watermarking, neural networks, zero-shot scoring, or search. Detection rates for watermark-based detectors dropped from 99.8% to the 80% range after one round of paraphrasing and to below 20% after two or more rounds; deep-learning-based detectors dropped from 100% to 60%; search-based detectors became meaningless after just 5 rounds of paraphrasing; and the zero-shot method DetectGPT dropped from an AUROC of 96.5% to 25%. In other words, repeatedly having an AI rewrite AI-written text renders most detectors ineffective.
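
The attack loop described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `recursive_paraphrase`, `toy_detect`, `toy_paraphrase`, and the word lists are hypothetical stand-ins, where a real attack would use a paraphrasing model such as DIPPER and a real detector.

```python
from typing import Callable, List

def recursive_paraphrase(text: str,
                         paraphrase: Callable[[str], str],
                         detect: Callable[[str], float],
                         rounds: int = 5) -> List[float]:
    """Return the detector score before the attack and after each round."""
    scores = [detect(text)]
    for _ in range(rounds):
        text = paraphrase(text)       # feed the previous output back in
        scores.append(detect(text))
    return scores

# Toy stand-ins (assumptions for illustration only).
GREEN = {"moreover", "utilize", "crucial", "delve"}   # words the toy detector flags
SYNONYMS = {"moreover": "also", "utilize": "use", "crucial": "key", "delve": "dig"}

def toy_detect(text: str) -> float:
    """Score = fraction of words that appear in the flagged list."""
    words = text.lower().split()
    return sum(w in GREEN for w in words) / len(words)

def toy_paraphrase(text: str) -> str:
    """Rewrite only the first flagged word, so each round removes some signal."""
    words = text.split()
    for i, w in enumerate(words):
        if w.lower() in SYNONYMS:
            words[i] = SYNONYMS[w.lower()]
            break
    return " ".join(words)

sample = "moreover we utilize crucial data to delve deeper"
scores = recursive_paraphrase(sample, toy_paraphrase, toy_detect, rounds=4)
print(scores)
```

In this toy setup the score falls with every round, mirroring the qualitative pattern the paper reports. The numbers themselves say nothing about real detectors.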

Quality evaluation showed that the paraphrased text remained usable. With DIPPER paraphrasing, 70% of outputs were rated "high quality" (Likert 4-5) for semantic preservation and 89% for grammar and text quality; with LLaMA-2-7B-Chat, 83% were rated "high quality." After 5 rounds of paraphrasing, perplexity increased only slightly (DIPPER: 5.5→8.7, LLaMA-2: 10.5), and accuracy on the SQuAD-v2 question-answering benchmark was 97%, indicating almost no information loss.

The research team also demonstrated "spoofing attacks," which make human-written text appear AI-written. A watermark-based system can be made to misidentify human-written text as AI-written by an attacker who knows its token pattern (the "green list"), and a search-based system produces false positives on original human text if an AI-rewritten version of that text is stored in its database.

The paper also proves mathematically that the more closely the distributions of AI text and human text come to resemble each other, the more the performance of any detector must fundamentally decline. True positive and false positive rates present an unavoidable trade-off.

Conclusion: AI detectors cannot be perfect, and "AI detection = evidence" is a dangerous myth. In education and journalism, multi-layered evaluation, process-centered evidence, and oral assessment must be used alongside detectors. Given these clearly demonstrated technical limitations, it is very dangerous to base assessment and disciplinary decisions solely on detector output.
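
The trade-off the paper proves can be made concrete with a small calculation. The sketch below assumes the bound as it is usually stated from the paper, AUROC ≤ 1/2 + TV − TV²/2, where TV is the total variation distance between the AI-text and human-text distributions, together with the related ROC bound TPR ≤ min(FPR + TV, 1). Neither formula is quoted in this article, so both should be checked against the paper itself. The function names are hypothetical.

```python
def auroc_upper_bound(tv: float) -> float:
    """Best achievable AUROC for any detector, given total variation distance tv."""
    if not 0.0 <= tv <= 1.0:
        raise ValueError("total variation distance must lie in [0, 1]")
    return 0.5 + tv - tv ** 2 / 2

def max_true_positive_rate(fpr: float, tv: float) -> float:
    """Upper limit on the true positive rate at a given false positive rate."""
    return min(fpr + tv, 1.0)

# As the two distributions converge (tv -> 0), the ceiling approaches 0.5,
# which is the AUROC of random guessing.
for tv in (1.0, 0.5, 0.1, 0.0):
    print(f"TV={tv:.1f}  AUROC <= {auroc_upper_bound(tv):.3f}")
```

Under these assumptions, a detector facing distributions only 0.1 apart in total variation cannot exceed an AUROC of 0.595, and at a 1% false positive rate it can catch at most about 11% of AI text, however it is built.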