Base Models Look Human To AI Detectors
Summary
A research paper finds that base language models appear human to AI detectors, unlike instruction-tuned models. The authors propose a paraphrasing pipeline (HIP) that improves human-likeness while preserving semantics across model sizes.
Cached at: 05/20/26, 10:40 PM
Source: https://huggingface.co/papers/2605.19516
Abstract
Instruction-tuned language models produce text that commercial detectors identify as non-human, prompting the development of a paraphrasing pipeline that improves human-likeness while preserving semantics across different model sizes.
As AI-generated text enters the real world at scale, institutions increasingly use commercial AI-text detectors, especially in education and academic-integrity workflows. We report a surprising empirical finding about such systems: when evaluated by GPTZero and Pangram, generated text from base models is often judged overwhelmingly human, whereas text generated by their instruction-tuned counterparts is not. Building on this observation, we propose Humanization by Iterative Paraphrasing (HIP), a detector-agnostic pipeline that minimally fine-tunes a base model into a paraphraser and applies it iteratively. Compared with the baselines we test, HIP yields a stronger trade-off between semantic preservation and detector evasion on commercial detectors. Across Llama-3 and Qwen-3 families, spanning model sizes from 0.6B to 70B, HIP consistently improves detector human-likeness. Our findings suggest that current detectors are tracking artifacts of instruction tuning and local context more than any invariant notion of machine-generated text. This, in turn, calls for detector designs that model these factors more explicitly.
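The abstract describes HIP only at a high level: a paraphraser applied repeatedly to its own output. The control flow of that idea can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names `iterative_paraphrase` and `toy_paraphraser`, the `rounds` parameter, and the string-replacement stand-in are all assumptions introduced here. In the paper, the paraphraser is a fine-tuned base model, and the abstract does not state how many rounds are used.

```python
from typing import Callable, List


def iterative_paraphrase(
    text: str,
    paraphrase: Callable[[str], str],
    rounds: int = 3,
) -> List[str]:
    """Apply `paraphrase` to `text` `rounds` times and return every version.

    Element 0 is the original input; element i is the text after i passes.
    No detector is consulted inside the loop, which keeps the sketch
    detector-agnostic in the sense the abstract uses the term.
    """
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    versions = [text]
    for _ in range(rounds):
        # Each pass rewrites the output of the previous pass.
        versions.append(paraphrase(versions[-1]))
    return versions


def toy_paraphraser(text: str) -> str:
    """Hypothetical stand-in: a real paraphraser would be a model call."""
    replacements = {"utilize": "use", "furthermore": "also", "in order to": "to"}
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text


if __name__ == "__main__":
    history = iterative_paraphrase(
        "we utilize tools in order to write", toy_paraphraser, rounds=2
    )
    for i, version in enumerate(history):
        print(i, version)
```

Keeping every intermediate version makes it possible to score each round separately for semantic preservation and detector human-likeness, the two quantities the abstract says are traded off.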
Get this paper in your agent:
hf papers read 2605.19516
Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash
Models citing this paper: 14
#### YixuanEvenXu/Llama-3-70B-HIP-adapter Text Generation • Updated about 18 hours ago • 13
#### YixuanEvenXu/Llama-3-70B-Instruct-HIP-adapter Text Generation • Updated about 18 hours ago • 10
#### YixuanEvenXu/Llama-3-8B-HIP-adapter Text Generation • Updated about 18 hours ago • 10
#### YixuanEvenXu/Llama-3-8B-Instruct-HIP-adapter Text Generation • Updated about 18 hours ago
Datasets citing this paper: 1
#### YixuanEvenXu/HIP-training-and-evaluation-data Viewer • Updated about 18 hours ago • 11.1k • 9
Similar Articles
How Human-Like Are Large Language Models? A Register-Aware Linguistic Evaluation Framework
This paper introduces a register-aware linguistic evaluation framework to assess how human-like large language models (LLMs) are by comparing the distribution of 67 lexico-grammatical features between human and LLM-generated texts using Maximum Mean Discrepancy. Experiments across seven instruction-tuned open-source models and five registers show that no model perfectly matches human baselines, and closeness to human language varies by register rather than model size.
Persona-Assigned Large Language Models Exhibit Human-Like Motivated Reasoning
This paper investigates whether assigning personas to large language models induces human-like motivated reasoning, finding that persona-assigned LLMs show up to 9% reduced veracity discernment and are up to 90% more likely to evaluate scientific evidence in ways congruent with their induced political identity, with prompt-based debiasing largely ineffective.
Amplifying, Not Learning: Fine-Tuned AI Text Detectors Amplify a Pretrained Direction
This paper demonstrates that fine-tuned AI text detectors amplify a pretrained typicality axis rather than learning an AI-vs-human boundary, with raw encoder projections often matching or exceeding fine-tuned performance.
Hidden Human-Like Nature of Machine-Generated Texts: Theory and Detection Enhancement
This paper reveals the existence of hidden human-like spans in machine-generated texts and proposes a model-agnostic stacked enhancement framework that improves existing detectors by reducing the influence of these spans.