Whisperian: It is one of the best Android applications if you want to use your microphone with local ASR models. It is also available on the Play Store.
Summary
Whisperian is an Android application that enables users to use a microphone with local automatic speech recognition (ASR) models, and it is available on the Play Store.
Similar Articles
Is Whisper still the best default for speech-to-text if the app needs to be real time?
Explores whether OpenAI's Whisper remains the top choice for real-time speech-to-text applications, considering alternatives and performance trade-offs.
Introducing Whisper
OpenAI introduces Whisper, an end-to-end encoder-decoder Transformer model trained on large-scale, diverse audio data for robust multilingual speech recognition, language identification, and speech-to-English translation. Whisper makes 50% fewer errors than specialized models across diverse datasets and outperforms supervised state-of-the-art models on speech translation, despite not being fine-tuned on any specific dataset.
@XieZhifei14110: Stop using Whisper for ASR ! open sourcing Mega-ASR — the first full-scenario SOTA industrial-grade ASR model, built fo…
Announces the open-source release of Mega-ASR, a full-scenario, industrial-grade ASR model designed for challenging audio conditions such as far-field speech and noise, which the author says outperforms existing open and closed models by 10-30% on real-world benchmarks.
@FeitengLi: Actually, these problems can be well solved: 1. Ditch whisper, switch to an ASR model. Qwen3-ASR is great with few hallucinations, and there are other ASR options. Whisper has many hallucinations and requires 30s segments. Qwen3-ASR gets more accurate with longer audio, supporting up to 20…
Recommends Qwen3-ASR over Whisper to reduce hallucinations and LattifAI tools for precise audio-text alignment and subtitle generation, and introduces the author's own OmniVAD-Kit project for voice activity detection.
@HowToAI_: ElevenLabs just lost its moat. Someone has open-sourced a single app that replaces ElevenLabs AND WisprFlow and runs 100…
Voicebox is an open-source, MIT-licensed app that replaces ElevenLabs and WisprFlow with local voice cloning, multiple TTS engines, and MCP server support, and it runs on a range of hardware.