edge-attribution-patching


How Much Do Circuits Tell Us? Measuring the Consistency and Specificity of Language Model Circuits

arXiv cs.CL · 2d ago

This paper evaluates the consistency and specificity of language model circuits. It finds that circuits are consistent within a task but are not task-specific, because circuits for different tasks overlap substantially.
