Tag: #hessian-vector-product

When Attribution Patching Lies: Diagnosis and a Second-Order Correction

arXiv cs.LG · 2026-06-10

This paper diagnoses systematic errors in attribution patching, a gradient-based approximation used for causal localization in language models, and proposes a second-order correction using Hessian-vector products that improves reliability with minimal additional computational cost.
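
To make the idea in the summary concrete, here is a minimal sketch of a first-order (gradient times activation difference) estimate next to a generic second-order Taylor correction that uses a Hessian-vector product. It is not the paper's method: the toy metric, the weights W, the vectors A_BASE and A_PATCH, and the choice to expand around the base run are all assumptions made for illustration. The Hessian-vector product is approximated with a central difference of gradients to keep the sketch dependency-free; an autodiff framework would normally compute it directly.

```python
import math

# Hypothetical toy setup: W and both activation vectors are made up for illustration.
W = [0.5, -0.3, 0.8]
A_BASE = [0.2, 0.1, -0.4]    # activation in the run we differentiate through
A_PATCH = [0.6, -0.2, 0.1]   # activation that would be patched in from the other run


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def metric(a):
    """Toy stand-in for a model's output metric as a function of one activation."""
    return math.tanh(dot(W, a)) + 0.5 * dot(a, a)


def grad(a):
    """Analytic gradient of the toy metric with respect to the activation."""
    scale = 1.0 - math.tanh(dot(W, a)) ** 2
    return [scale * w + x for w, x in zip(W, a)]


def hvp(a, d, eps=1e-4):
    """Approximate the Hessian-vector product H(a) @ d by a central difference of gradients."""
    g_plus = grad([x + eps * y for x, y in zip(a, d)])
    g_minus = grad([x - eps * y for x, y in zip(a, d)])
    return [(p - m) / (2.0 * eps) for p, m in zip(g_plus, g_minus)]


# Difference between the patched-in activation and the base activation.
delta = [p - b for p, b in zip(A_PATCH, A_BASE)]

# Ground truth for this toy: actually swap the activation and re-evaluate the metric.
true_effect = metric(A_PATCH) - metric(A_BASE)

# First-order estimate: gradient at the base activation times the difference.
first_order = dot(grad(A_BASE), delta)

# Second-order estimate: add the quadratic Taylor term 0.5 * delta^T H delta.
second_order = first_order + 0.5 * dot(delta, hvp(A_BASE, delta))

print(f"true effect:           {true_effect:.4f}")
print(f"first-order estimate:  {first_order:.4f}")
print(f"second-order estimate: {second_order:.4f}")
```

In this toy case the quadratic term costs two extra gradient evaluations and brings the estimate closer to the true patching effect; whether that holds for a real model depends on how nonlinear the metric is along the patching direction.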
