Tag
PEEK is an efficient dynamic frame sampling method that distills caption-conditioned frame relevance rankings from a teacher model into a lightweight temporal model, outperforming state-of-the-art methods on video captioning while remaining computationally efficient.
SAI-DPO introduces a dynamic sampling framework that adapts the training data to a model's evolving capabilities on mathematical reasoning tasks, using self-aware difficulty metrics and knowledge semantic alignment to achieve state-of-the-art efficiency with less data on benchmarks such as AIME24 and AMC23.