joint-training

Rethinking Speech-LLM Integration for ASR: Effective Joint Speech-Text Training by Interleaving

arXiv cs.CL · 2026-07-03

This paper proposes Joint Speech-Text Interleaved Pretraining (JSTIP), a pretraining strategy that constructs word-level and segment-level interleaved speech-text sequences to improve ASR entity accuracy and reduce the modality gap between speech and text. It reports competitive performance on domain adaptation and zero-shot speech question answering.

Chronicle: A Multimodal Foundation Model for Joint Language and Time Series Understanding

arXiv cs.LG · 2026-05-21

Chronicle is a 324M-parameter decoder-only transformer pretrained from scratch on both natural language and time series. It achieves competitive performance on NLU and time series classification tasks and sets a new state of the art for frozen-embedding time series classification on the UCR/UEA datasets.

ICRL: Learning to Internalize Self-Critique with Reinforcement Learning

arXiv cs.AI · 2026-05-18

This paper introduces ICRL, a framework that jointly trains a solver and critic with reinforcement learning to internalize critique guidance, enabling the solver to improve without external critique. It uses distribution calibration and role-wise group advantage estimation, achieving 6-7 point gains over GRPO on agentic and mathematical reasoning tasks.
