Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

Hugging Face Daily Papers

Summary

This paper proposes a reinforcement learning approach to enable large language models to translate unseen languages by leveraging in-context linguistic knowledge, outperforming in-context learning and supervised fine-tuning.

Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by undergoing continued training or even by encoding a grammar book in their context. However, both methods typically overfit specific languages, with limited zero-shot transfer at test time. To translate extremely low-resource languages at scale, we argue that LLMs must acquire the meta-skill of utilizing in-context linguistic knowledge rather than memorizing specific languages. In this paper, we propose a reinforcement learning (RL) approach to unseen language translation given rich linguistic context, using a surface-level translation metric (chrF) as the reward. Empirically, despite the lightweight reward, our RL-trained models effectively extract and apply relevant linguistic information from the provided context, leading to better translations on completely unseen languages than in-context learning or supervised fine-tuning. Our analyses suggest that outcome-based RL can extend beyond conventional reasoning tasks like math and coding to serve as a recipe for language learning from context.
Cached at: 06/05/26, 06:07 AM

Paper page - Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

Source: https://huggingface.co/papers/2606.06428

Abstract

A reinforcement learning approach enables large language models to translate unseen languages by leveraging in-context linguistic knowledge rather than memorizing specific languages.

Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by undergoing continued training or even by encoding a grammar book in their context. However, both methods typically overfit specific languages, with limited zero-shot transfer at test time. To translate extremely low-resource languages at scale, we argue that LLMs must acquire the meta-skill of utilizing in-context linguistic knowledge rather than memorizing specific languages. In this paper, we propose a reinforcement learning (RL) approach to unseen language translation given rich linguistic context, using a surface-level translation metric (chrF) as the reward. Empirically, despite the lightweight reward, our RL-trained models effectively extract and apply relevant linguistic information from the provided context, leading to better translations on completely unseen languages than in-context learning or supervised fine-tuning. Our analyses suggest that outcome-based RL can extend beyond conventional reasoning tasks like math and coding to serve as a recipe for language learning from context.
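To make the reward concrete, here is a minimal sketch of a chrF-style score used as a scalar reward for a sampled translation. It is a simplified character n-gram F-score and an illustration only: the function names, the n-gram order (`max_n=6`), the recall weight (`beta=2.0`), and the whitespace handling are assumptions, not details confirmed by this page, and the paper's actual reward computation may differ.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after removing whitespace (an assumption)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf_reward(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF-style score in [0, 1], usable as an outcome reward.

    Averages character n-gram precision and recall over the orders
    1..max_n for which both strings have at least one n-gram, then
    combines them with an F-beta score (beta > 1 favors recall).
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # string too short for this n-gram order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)
```

In an outcome-based RL loop, a score like this would be computed once per sampled translation against the reference and passed to the policy update as the reward, with no supervision on intermediate reasoning steps.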

Datasets citing this paper: 1

HanxuHU/rl-new-language

Similar Articles

Translate-R1: Cost-Aware Translation Tool Use via Reinforcement Learning

arXiv cs.CL

Translate-R1 introduces a reinforcement learning approach for cost-aware translation tool use in LLMs, where the model learns to decide when to translate inputs based on its own comprehension and a cost-sensitivity parameter, achieving Pareto-optimal trade-offs across multiple languages.