persona-simulation

#persona-simulation

Beyond Static Benchmarks: Synthesizing Harmful Content via Persona-based Simulation for Robust Evaluation

arXiv cs.CL ↗ · 2026-04-21 Cached

Researchers from KAIST propose a framework that uses persona-guided LLM agents to synthesize diverse harmful content for stress-testing detection systems, addressing limitations of static benchmarks such as scalability, diversity, and data contamination. Both human and LLM evaluations confirm the synthetic scenarios are harder to detect than existing benchmarks while maintaining linguistic and topical diversity.

0 favorites 0 likes

persona-simulation

Beyond Static Benchmarks: Synthesizing Harmful Content via Persona-based Simulation for Robust Evaluation

Submit Feedback