talkie-lm/talkie-1930-13b-it
Summary
Talkie-1930-13b-it is a 13B-parameter instruction-tuned language model pretrained on pre-1931 text and post-trained with online DPO (reinforcement learning with an LLM-as-a-judge).
Source: https://huggingface.co/talkie-lm/talkie-1930-13b-it
talkie-1930-13b-it is a 13B vintage language model. It is an instruction-tuned post-train of talkie-1930-13b-base, which was trained on 260B tokens of pre-1931 English-language text.
talkie-1930-13b-it was finetuned using a novel dataset of instruction-response pairs extracted from pre-1931 reference works, including etiquette manuals, encyclopedias, and letter-writing manuals. The model then underwent reinforcement learning (online DPO with an LLM-as-a-judge) to improve instruction-following ability.
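The excerpt above does not include the training code, so the following is only a rough sketch of the core pairwise DPO objective it refers to; in the online variant, the (chosen, rejected) pairs would come from sampling the current policy and scoring the samples with the LLM judge. All tensor names here are hypothetical.

```python
import torch
import torch.nn.functional as F

def dpo_loss(
    policy_chosen_logps: torch.Tensor,
    policy_rejected_logps: torch.Tensor,
    ref_chosen_logps: torch.Tensor,
    ref_rejected_logps: torch.Tensor,
    beta: float = 0.1,
) -> torch.Tensor:
    """Pairwise DPO objective over a batch of (chosen, rejected) responses.

    Inputs are per-sequence log-probabilities under the policy being trained
    and under a frozen reference model; beta bounds drift from the reference.
    This is an illustrative sketch, not talkie's published training code.
    """
    chosen_margin = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_margin = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the policy to prefer the judge-chosen response over the rejected one.
    return -F.logsigmoid(chosen_margin - rejected_margin).mean()
```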
Read more about talkie in our report.
Reference code to run talkie is available on GitHub.
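The linked GitHub repository is the authoritative reference for running the model. As a minimal sketch, assuming the checkpoint loads through the standard Hugging Face transformers API (the prompt shown is a placeholder, not the model's documented prompt format):

```python
# Minimal sketch of loading talkie-1930-13b-it via Hugging Face transformers.
# Assumes the standard AutoModel API applies; see the official GitHub reference code.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "talkie-lm/talkie-1930-13b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # requires accelerate; shards the 13B model across available devices
    torch_dtype="auto",
)

prompt = "Compose a short letter of introduction."  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```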
Similar Articles
Decomposing the Basic Abilities of Large Language Models: Mitigating Cross-Task Interference in Multi-Task Instruct-Tuning
This paper proposes Badit, a method that decomposes large language model parameters into orthogonal high-singular-value LoRA experts to mitigate cross-task interference during multi-task instruction tuning.
Token Statistics Reveal Conversational Drift in Multi-turn LLM Interaction
This paper introduces Bipredictability (P) and the Information Digital Twin (IDT), a lightweight method to monitor conversational consistency in multi-turn LLM interactions using token frequency statistics without embeddings or model internals. The approach achieves 100% sensitivity in detecting contradictions and topic shifts while establishing a practical monitoring framework for extended LLM deployments.
Cross-Tokenizer LLM Distillation through a Byte-Level Interface
This paper proposes Byte-Level Distillation (BLD), a simple method for cross-tokenizer knowledge transfer in language models by operating at a shared byte-level interface, achieving competitive or superior performance compared to more complex existing approaches across 1B-8B parameter models.
When2Speak: A Dataset for Temporal Participation and Turn-Taking in Multi-Party Conversations for Large Language Models
When2Speak is a synthetic dataset and pipeline for training LLMs to decide when to speak in multi-party conversations. Fine-tuning on this dataset significantly improves turn-taking, with reinforcement learning reducing missed interventions from 50% to ~20%.
openbmb/VoxCPM2
VoxCPM2 is an open-source, tokenizer-free diffusion autoregressive Text-to-Speech model supporting 30 languages with 2B parameters, 48kHz audio output, and features including voice design from natural language descriptions, controllable voice cloning, and real-time streaming capabilities.