data-attribution

#data-attribution

GRASP: Geometry-aware Residual Alignment for Scalable Pretraining Data Attribution

arXiv cs.LG · 2d ago

GRASP introduces a geometry-aware, interaction-based method for scalable pretraining data attribution that models subset dynamics. It achieves more than double the task-level rank correlation of existing additive approaches while reducing computation cost.

#data-attribution

How Faithful Is Trajectory-Based Data Attribution? Error Sources, Remedies, and Practical Guidelines

arXiv cs.LG · 2026-05-20

This paper provides the first systematic analysis of error sources in trajectory-based data attribution methods, identifies optimizer mismatch as the dominant error source, proposes AdamW-influence to address it, and offers practical guidelines for data selection via a K-step look-ahead framework.
