Tag
A new semantic-adaptive eviction policy for LLM prefix caches that learns token reuse patterns across token types, achieving a 1.4x-2.7x improvement in time to first token (TTFT) over existing policies.
This paper proposes CTO, a code translation method that combines syntax-guided and semantic-aware preference optimization, using contrastive learning and direct preference optimization (DPO), and achieves significant improvements over existing baselines on C++, Java, and Python translation.