adversarial

#adversarial

Adversarial Creation and Detection of AI-Generated Social Bot Content

arXiv cs.CL · yesterday

This paper presents an adversarial methodology for creating and detecting AI-generated social bot content, curating a multilingual, cross-platform dataset of paired human and AI messages. Training on this adversarial data yields detection that significantly outperforms existing content-based bot detection models in real-world settings.


Deliberative Curation: A Protocol for Multi-Agent Knowledge Bases

arXiv cs.AI · 2026-06-02

This paper introduces a deliberative curation protocol for multi-agent knowledge bases, addressing governance gaps such as agent statelessness and sycophancy. It evaluates the protocol via simulation, showing improved resilience under adversarial conditions.


CSULoRA: Closest Safe Update Low-Rank Adaptation

arXiv cs.LG · 2026-06-01

CSULoRA is a post-hoc method for correcting trained LoRA adapters to preserve safety alignment while maintaining utility, using closest safe update estimation.


I built a Hermes Skill where 3 AI models argue with each other before giving you an answer - adversarial multi-model consensus with RRF + Borda Count ranking

Reddit r/AI_Agents · 2026-05-31

PolyGnosis is an adversarial multi-model consensus system built as a Hermes skill. It runs three AI models in parallel, each with a different expert persona, then passes their answers through a hostile critic phase, scores them with RRF (Reciprocal Rank Fusion) and Borda Count, and applies a synthesis gate. The whole system was built agentically using DeepSeek V4-Pro.
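As a rough illustration of the two scoring schemes named in this summary, the sketch below computes Reciprocal Rank Fusion and Borda Count scores over a few ranked lists. It is a minimal sketch under stated assumptions, not PolyGnosis's actual code: the function names, the candidate labels, and the RRF constant `k=60` (a commonly used default) are all hypothetical, and the post does not say how the two scores are combined.

```python
from collections import defaultdict


def rrf_scores(rankings, k=60):
    """Reciprocal Rank Fusion: each list adds 1 / (k + rank) per item (rank starts at 1)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    return dict(scores)


def borda_scores(rankings):
    """Borda Count: in a list of n items, the top item earns n - 1 points, the last earns 0."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - 1 - position
    return dict(scores)


# Hypothetical example: three judges each rank the same three candidate answers.
rankings = [
    ["answer_a", "answer_b", "answer_c"],
    ["answer_b", "answer_a", "answer_c"],
    ["answer_a", "answer_c", "answer_b"],
]

rrf = rrf_scores(rankings)
borda = borda_scores(rankings)

# Print candidates from best to worst by RRF score, with both scores side by side.
for item in sorted(rrf, key=rrf.get, reverse=True):
    print(f"{item}: rrf={rrf[item]:.5f} borda={borda[item]}")
```

Both schemes depend only on rank positions, not on raw model scores, so they can merge rankings from models whose confidence values are not comparable.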


Hidden Human-Like Nature of Machine-Generated Texts: Theory and Detection Enhancement

arXiv cs.CL · 2026-05-25

This paper reveals the existence of hidden human-like spans in machine-generated texts and proposes a model-agnostic stacked enhancement framework that improves existing detectors by reducing the influence of these spans.


Coloring the Noise: Adversarial Sobolev Alignment for Faithful Image Super Resolution

Hugging Face Daily Papers · 2026-05-22

This paper proposes an adversarial Sobolev alignment method for faithful image super resolution, aiming to reduce artifacts and improve fidelity.


I built two multi-agent AI systems with completely opposite philosophies. Here's what I've learned so far.

Reddit r/AI_Agents · 2026-05-20

The author builds two multi-agent AI systems with opposite design philosophies: ChaoticAI (collaborative, org-chart-based) and S.A.G.E. with RAAC (adversarial argumentation). The post shares reflections on memory architecture and the potential synthesis of both approaches.


NewsLens: A Multi-Agent Framework for Adversarial News Bias Navigation

arXiv cs.CL · 2026-05-19

NewsLens is a multi-agent framework for navigating and exposing adversarial news bias, with the aim of identifying and countering biased content in news media.


ALSO: Adversarial Online Strategy Optimization for Social Agents

arXiv cs.AI · 2026-05-18

ALSO introduces a framework for online strategy optimization in multi-agent social simulation, formulating multi-turn interaction as an adversarial bandit problem and using a neural surrogate for reward prediction. Experiments on the Sotopia benchmark show it outperforms static baselines and existing optimization methods.


Chainwash: Multi-Step Rewriting Attacks on Diffusion Language Model Watermarks

arXiv cs.CL · 2026-05-08

This paper introduces Chainwash, a multi-step rewriting attack that removes statistical watermarks from the outputs of a diffusion language model (LLaDA-8B-Instruct), reducing the watermark detection rate from 87.9% to 4.86% after five chained rewrites.
