asr

#asr

@aigclink: An open-source, end-to-end video translation and video Q&A Skill: Violin. Its highlight is not literal translation but content re-creation. It integrates ASR, LLM translation, and TTS into one pipeline, with the three modules chained automatically: input a video and get a dubbed, translated video. Translation style is adjustable, for example...

X AI KOLs Timeline · 9h ago

Violin is an open-source end-to-end video translation and video Q&A tool, integrating ASR, LLM translation, and TTS. It supports style adjustment and content re-creation, and can answer questions about video content.


A Calculus-Based Framework for Determining Vocabulary Size in End-to-End ASR

arXiv cs.CL · 15h ago

This paper presents a calculus-based framework that uses first- and second-derivative tests to estimate the optimal vocabulary size hyperparameter for end-to-end ASR systems, improving performance on the LibriSpeech corpus.


@berryxia: Guys, this is awesome! Install it right away! Kevin Lin, a postdoc at Oxford and former Meta and Microsoft researcher, just released Violin, an open-source video translation Skill. Video is already the dominant content form on the internet. Yet most high-quality lectures, speeches, and podcasts are locked behind a single language…

X AI KOLs Timeline · 18h ago

Violin is an open-source video translation tool that integrates speech recognition, large language model translation, and text-to-speech. It supports over 30 languages and offers three usage modes: CLI, web app, and Claude Code.


Vividh-ASR: A Complexity-Tiered Benchmark and Optimization Dynamics for Robust Indic Speech Recognition

Hugging Face Daily Papers · 2d ago

Introduces Vividh-ASR, a complexity-tiered benchmark for Hindi and Malayalam ASR, identifies a studio bias in fine-tuning, and proposes R-MFT to improve spontaneous-speech performance efficiently.


Dolphin-CN-Dialect: Where Chinese Dialects Matter

arXiv cs.CL · 3d ago

Dolphin-CN-Dialect is a streaming-capable ASR model that improves dialect recognition through temperature-based sampling and redesigned tokenization, achieving competitive performance with a smaller model size.


Adding Benchmaxxer Repellant to the Open ASR Leaderboard

Hugging Face Blog · 2026-05-06

Hugging Face announces the addition of private, high-quality datasets from Appen and DataoceanAI to the Open ASR Leaderboard to prevent benchmaxxing and test-set contamination, while maintaining public data for the default average WER calculation.


Voice of India: A Large-Scale Benchmark for Real-World Speech Recognition in India

arXiv cs.CL · 2026-04-22

Researchers introduce Voice of India, a 536-hour closed benchmark of unscripted telephonic conversations across 15 Indian languages and 139 regional clusters, exposing geographic and demographic ASR performance disparities.


@aigclink: Alibaba Tongyi Lab just dropped Fun-ASR 1.5: one industrial-grade model handles 30 languages, all 7 major Chinese dialect families + 20+ regional accents, even classical-poetry recitation. Dialect CER down 56.2% vs last gen; 5 dialects top 90% accuracy…

X AI KOLs Timeline · 2026-04-20

Alibaba Tongyi Lab releases Fun-ASR 1.5: a single model covering 30 languages, seven Chinese dialect groups, and 20+ local accents; character error rate (CER) in key dialect scenarios falls 56.2%, with five dialects exceeding 90% accuracy.


BlasBench: An Open Benchmark for Irish Speech Recognition

arXiv cs.CL · 2026-04-20

BlasBench introduces an open evaluation benchmark for Irish speech recognition with Irish-aware text normalization that preserves linguistic features like fadas, lenition, and eclipsis. The paper benchmarks 12 ASR systems across four architecture families, revealing significant generalization gaps and showing that existing multilingual systems struggle with Irish due to inadequate normalization.
