data-poisoning

#data-poisoning

Mental Damage: Caption Poisoning Attacks on Retrieval-Augmented Text-to-Music Generation

arXiv cs.AI · 3d ago

This paper introduces a dual-layer caption poisoning attack on retrieval-augmented text-to-music systems, showing that an attacker can inject malicious captions into the knowledge database to steer generated music toward attacker-chosen intent without modifying user prompts or models.

Be Kind, Rewrite: Benign Projections via Rewriting Defend Against LLM Data Poisoning Attacks

Hugging Face Daily Papers · 2026-05-18

This paper introduces open-book benign rewriting (OBBR) as a proactive defense against backdoor attacks on LLMs, showing it neutralizes harmful content by projecting to benign prompts, and improves safety by 51% over state-of-the-art defenses.

What are AI tarpits? Understanding the tools people are using to poison LLMs

Reddit r/ArtificialInteligence · 2026-05-17

AI tarpits are tools used by content creators to poison large language models by feeding scrapers useless or incorrect data, degrading AI output quality.

SoK: A Comprehensive Analysis of the Current Status of Neural Tangent Generalization Attacks with Research Directions

arXiv cs.LG · 2026-05-14

This paper presents a comprehensive analysis of the Neural Tangent Generalization Attack (NTGA) for data protection, including a taxonomy of related attacks, and discusses future research directions.

When Emotion Becomes Trigger: Emotion-style dynamic Backdoor Attack Parasitising Large Language Models

arXiv cs.CL · 2026-05-13

This paper introduces Paraesthesia, a dynamic backdoor attack on LLMs that uses emotional style as a stealthy trigger during fine-tuning, achieving high success rates while maintaining model utility.
