This paper introduces LogiHard, a framework that uses combinatorial hardening to expose compositional failures in frontier LLMs, demonstrating significant accuracy drops on logical reasoning tasks once problems are hardened.
OpenAI publishes a white paper detailing their approach to external red teaming for AI models, outlining methods for selecting diverse red team members, determining model access levels, providing testing infrastructure, and synthesizing feedback to improve AI safety and policy coverage.