long-form

#long-form

SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue

Hugging Face Daily Papers · 2026-05-29

SwanVoice is a zero-shot text-to-speech model for expressive long-form monologue and dialogue synthesis. It combines a VAE, a flow-matching DiT, and diffusion post-training, and achieves higher richness and hierarchy scores than existing baselines.

Comprehensive Benchmarking of Long-Form Speech Generation in Diverse Scenarios

Hugging Face Daily Papers · 2026-05-27

Swanbench-Speech is a comprehensive benchmark for evaluating long-form speech generation across diverse scenarios, using multi-dimensional metrics covering acoustics, semantics, and expressiveness, revealing limitations of current models.

When Reasoning Supervision Hurts: TTCW-Based Long-Form Literary Review Generation

arXiv cs.CL · 2026-05-21

This paper constructs a large dataset of 263,911 long-form stories annotated with TTCW-based creativity metrics and fine-tunes Qwen3 models to generate structured review reports. It finds that non-reasoning fine-tuning outperforms reasoning-supervised fine-tuning, which suffers from parse failures and irrelevant repetition.
