error-prediction

Adversarial Concept Search: Predicting Compositional Errors From Feature Geometry

arXiv cs.AI · 2026-06-15

This paper proposes Adversarial Concept Search, a method that uses the representational geometry of large language models to predict compositional failures without evaluating specific inputs. The approach identifies high-risk scenarios by measuring interference between salient features.
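The summary does not say how interference between features is measured, so the following is only an illustrative sketch under stated assumptions, not the paper's procedure. It assumes each feature is available as a direction vector and treats the absolute cosine similarity between two directions as a stand-in for their interference. The function name `interference_scores` and the toy vectors are hypothetical.

```python
import itertools

import numpy as np


def interference_scores(features):
    """Rank feature pairs by |cosine similarity|, highest overlap first.

    `features` maps a feature name to its direction vector. Using cosine
    overlap as an interference measure is an assumption made for this sketch.
    """
    # Normalize every direction to unit length so dot products are cosines.
    unit = {}
    for name, vec in features.items():
        v = np.asarray(vec, dtype=float)
        unit[name] = v / np.linalg.norm(v)

    # Score each unordered pair of features exactly once.
    pairs = [
        (a, b, abs(float(unit[a] @ unit[b])))
        for a, b in itertools.combinations(unit, 2)
    ]
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Toy, made-up directions (not taken from any model).
toy_features = {
    "color": [1.0, 0.0, 0.0],
    "shape": [0.8, 0.6, 0.0],
    "count": [0.0, 0.0, 1.0],
}
ranked = interference_scores(toy_features)
for a, b, score in ranked:
    print(f"{a} / {b}: {score:.2f}")
```

In this toy setup the pair whose directions overlap most is listed first. Under the sketch's assumption, that pair would be the first candidate to examine for compositional errors, and no specific input has to be evaluated to produce the ranking.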
