PreScam: A Benchmark for Predicting Scam Progression from Early Conversations

Hugging Face Daily Papers 05/12/26, 12:00 AM Papers

ai-safety scam-detection benchmark conversational-ai nlp security

Summary

PreScam is a benchmark for modeling scam progression in multi-turn conversations, built from real-world scam reports. It includes tasks like real-time termination prediction and scammer action prediction, finding that supervised encoders outperform zero-shot LLMs.

Conversational scams, such as romance and investment scams, are emerging as a major form of online fraud. Unlike one-shot scam lures such as fake lottery or unpaid toll messages, they unfold through multi-turn conversations in which scammers gradually manipulate victims using evolving psychological techniques. However, existing research mainly focuses on static scam detection or synthetic scams, leaving open whether language models can understand how real-world scams progress over time. We introduce PreScam, a benchmark for modeling scam progression from early conversations. Built from user-submitted scam reports, PreScam filters and structures 177,989 raw reports into 11,573 conversational scam instances spanning 20 scam categories. Each instance is hierarchically structured according to the scam lifecycle defined by the proposed scam kill chain, and further annotated at the turn level with scammer psychological actions and victim responses. We benchmark models on two tasks: real-time termination prediction, which estimates whether a conversation is approaching the termination stage, and scammer action prediction, which forecasts the scammer's subsequent actions. Results show a clear gap between surface-level fluency and progression modeling: supervised encoders substantially outperform zero-shot LLMs on real-time termination prediction, while next-action prediction remains only moderately successful even for strong LLMs. Taken together, these results show that current models can capture some scam-related cues, yet still struggle to track how risk escalates and how manipulation unfolds across turns.

Original Article

Cached at: 05/15/26, 04:26 PM

Source: https://huggingface.co/papers/2605.12243

Abstract

The PreScam benchmark supports modeling scam progression in multi-turn conversations by structuring real-world reports according to a scam kill chain and annotating scammer psychological actions and victim responses.

Conversational scams, such as romance and investment scams, are emerging as a major form of online fraud. Unlike one-shot scam lures such as fake lottery or unpaid toll messages, they unfold through multi-turn conversations in which scammers gradually manipulate victims using evolving psychological techniques. However, existing research mainly focuses on static scam detection or synthetic scams, leaving open whether language models can understand how real-world scams progress over time. We introduce PreScam, a benchmark for modeling scam progression from early conversations. Built from user-submitted scam reports, PreScam filters and structures 177,989 raw reports into 11,573 conversational scam instances spanning 20 scam categories. Each instance is hierarchically structured according to the scam lifecycle defined by the proposed scam kill chain, and further annotated at the turn level with scammer psychological actions and victim responses. We benchmark models on two tasks: real-time termination prediction, which estimates whether a conversation is approaching the termination stage, and scammer action prediction, which forecasts the scammer’s subsequent actions. Results show a clear gap between surface-level fluency and progression modeling: supervised encoders substantially outperform zero-shot LLMs on real-time termination prediction, while next-action prediction remains only moderately successful even for strong LLMs. Taken together, these results show that current models can capture some scam-related cues, yet still struggle to track how risk escalates and how manipulation unfolds across turns.
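
To make the two tasks concrete, the following is a minimal sketch of how a turn-annotated conversation could be turned into prediction examples. It is not the paper's data format or evaluation code. The `Turn` fields, the stage and action labels other than "termination", the `horizon` parameter, and both helper functions are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    speaker: str  # "scammer" or "victim"
    text: str
    stage: str    # kill-chain stage; only "termination" is named in the abstract
    action: str   # hypothetical turn-level annotation label


def termination_prefix_examples(
    turns: List[Turn], horizon: int = 2
) -> List[Tuple[List[Turn], int]]:
    """Pair each pre-termination prefix with a binary label.

    The label is 1 if a termination-stage turn occurs within the next
    `horizon` turns (an assumed definition of "approaching termination").
    """
    examples = []
    for end in range(1, len(turns) + 1):
        prefix = turns[:end]
        if prefix[-1].stage == "termination":
            break  # stop once the conversation has reached termination
        upcoming = turns[end:end + horizon]
        label = int(any(t.stage == "termination" for t in upcoming))
        examples.append((prefix, label))
    return examples


def next_action_examples(turns: List[Turn]) -> List[Tuple[List[Turn], str]]:
    """Pair each prefix with the action label of the next scammer turn."""
    examples = []
    for end in range(1, len(turns)):
        later = [t for t in turns[end:] if t.speaker == "scammer"]
        if later:
            examples.append((turns[:end], later[0].action))
    return examples


# Toy conversation with invented text and hypothetical labels.
demo = [
    Turn("scammer", "Hi, I found your profile.", "contact", "rapport"),
    Turn("victim", "Hello, who is this?", "contact", "engage"),
    Turn("scammer", "You seem so kind.", "trust", "flattery"),
    Turn("scammer", "Send the fee today or lose it.", "payment", "urgency"),
    Turn("victim", "I am reporting this.", "termination", "refuse"),
]

termination_labels = [y for _, y in termination_prefix_examples(demo)]
next_actions = [a for _, a in next_action_examples(demo)]
print(termination_labels)
print(next_actions)
```

Building one example per prefix mirrors the "real-time" framing: a model sees only the turns available so far and must judge what comes next, rather than classifying the finished conversation.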

Similar Articles

ORACLE: Anticipating Scams from Partial Trajectories in Streaming App Usage

PromptScout

An Expanded Synthetic Conversation Dataset for Multi-Turn Smishing Detection

Been watching real adversarial input hit my detection API for six months. Here's what's actually landing.

@HowToAI_: Microsoft Research + Salesforce has published a paper that should scare every single AI builder right now. It’s called …
