Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

Hugging Face Daily Papers 06/04/26, 12:00 AM Papers

Summary

This paper proposes a reinforcement learning approach to enable large language models to translate unseen languages by leveraging in-context linguistic knowledge, outperforming in-context learning and supervised fine-tuning.

Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by undergoing continued training or even by encoding a grammar book in their context. However, both methods typically overfit specific languages, with limited zero-shot transfer at test time. To translate extremely low-resource languages at scale, we argue that LLMs must acquire the meta-skill of utilizing in-context linguistic knowledge rather than memorizing specific languages. In this paper, we propose a reinforcement learning (RL) approach to unseen language translation given rich linguistic context, using a surface-level translation metric (chrF) as the reward. Empirically, despite the lightweight reward, our RL-trained models effectively extract and apply relevant linguistic information from the provided context, leading to better translations on completely unseen languages than in-context learning or supervised fine-tuning. Our analyses suggest that outcome-based RL can extend beyond conventional reasoning tasks like math and coding to serve as a recipe for language learning from context.

Original Article


Cached at: 06/05/26, 06:07 AM


Source: https://huggingface.co/papers/2606.06428

Abstract

Reinforcement learning approach enables large language models to translate unseen languages by leveraging in-context linguistic knowledge rather than memorizing specific languages.

Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by undergoing continued training or even by encoding a grammar book in their context. However, both methods typically overfit specific languages, with limited zero-shot transfer at test time. To translate extremely low-resource languages at scale, we argue that LLMs must acquire the meta-skill of utilizing in-context linguistic knowledge rather than memorizing specific languages. In this paper, we propose a reinforcement learning (RL) approach to unseen language translation given rich linguistic context, using a surface-level translation metric (chrF) as the reward. Empirically, despite the lightweight reward, our RL-trained models effectively extract and apply relevant linguistic information from the provided context, leading to better translations on completely unseen languages than in-context learning or supervised fine-tuning. Our analyses suggest that outcome-based RL can extend beyond conventional reasoning tasks like math and coding to serve as a recipe for language learning from context.
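The abstract names chrF, a character n-gram F-score, as the reward signal. A minimal sketch of how such a reward could be computed is shown below. It is a simplified sentence-level variant written for illustration only: the function names `char_ngrams` and `chrf_reward`, the whitespace handling, and the defaults (n-gram orders 1 to 6, beta = 2) are assumptions, not details confirmed by the paper, and the scores will not match a reference implementation such as sacreBLEU exactly.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after stripping whitespace (a simplifying assumption)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf_reward(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level chrF in [0, 1], usable as a scalar RL reward.

    Hypothetical helper: precision and recall are averaged over the n-gram
    orders that exist in both strings, then combined into an F-beta score.
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # this order is longer than one of the strings
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision.
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

In an outcome-based RL loop, a function like this would score each sampled translation against the reference, and that scalar would be the only feedback the policy receives. Because the score depends only on surface character overlap, it needs no language-specific tokenizer, which is one plausible reason a metric of this kind suits languages the model has never seen.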


Get this paper in your agent:

hf papers read 2606.06428

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash


Datasets citing this paper: 1

#### HanxuHU/rl-new-language


Similar Articles

Self-Consolidating Language Models: Continual Knowledge Incorporation from Context

Context-Aware RL for Agentic and Multimodal LLMs

Mid-Training with Self-Generated Data Improves Reinforcement Learning in Language Models

Reinforcement Learning for Evidence-Seeking Diagnostic Reasoning with Large Language Models

Sentence-Level Contextual Entrainment in Large Language Models
