Tag
This paper proposes RidgeFT, a lightweight analytic update framework for lifelong machine-generated text attribution. The framework adapts to new text generators without forgetting old ones and achieves strong performance across multiple evaluation settings.
This paper reveals the existence of hidden human-like spans in machine-generated texts and proposes a model-agnostic stacked enhancement framework that improves existing detectors by reducing the influence of these spans.
This paper investigates the resilience of AI-generated text detection methods (fine-tuned RoBERTa, Binoculars, text feature analysis, and ensembles of these) against paraphrasing attacks. It finds that ensembles that include Binoculars are the most effective but also the most vulnerable to attack, highlighting a trade-off between performance and resilience.
This paper evaluates 15 machine-generated text detection models across six systems and multiple datasets. It finds that model rankings vary widely depending on the choice of dataset and metric, and that models perform poorly on novel human-written texts in high-risk domains. The authors highlight that methodological choices in evaluation are critical for accurately reflecting model performance.