@FeitengLi: Fable 5 took the lead (just half a day), then Codex took over development for a week. #EdgeSpeak is now live. Friends who shared the post, contact me for an invite code: https://edgespeak.com/zh

X AI KOLs Timeline 06/21/26, 10:37 AM Products

voice-ai local-transcription desktop-tool lattice-2 speech-recognition product-launch edge-computing

Summary

The EdgeSpeak desktop voice transcription tool is now live, featuring the local Lattice-2 voice model. It supports offline audio/video transcription in multiple languages and accents, and provides a local API that developers can integrate.

Original Article

Cached at: 06/22/26, 01:41 AM

Fable 5 took the lead (just half a day), then Codex took over development for a week, and #EdgeSpeak has officially launched. Friends who share the post, contact me for an invite code.

https://t.co/HquVtvEK9n https://t.co/stsA7xYV9D

Put Voice AI Models into Your Computer | EdgeSpeak

Source: https://edgespeak.com/

On-Device Speech Engine

EdgeSpeak optimizes and compresses professional speech models for desktop use. It transcribes meetings, interviews, videos, and new recordings locally; audio, video, and transcripts stay on your device.

Let your computer understand meetings, interviews, and videos on its own.

local-transcribe.mov

00:04 Drag in a meeting, interview, video, or start a new recording.

00:12 Lattice-2 transcribes locally; view the text as it runs.

00:27 Proofread and export, or pass to other tools.

Dedicated On-Device Large Model

Lattice-2: A Large Speech Model Deeply Optimized for Desktop

Lattice-2 is the local speech model series running on EdgeSpeak. Through compression, inference optimization, and local compute adaptation, it enables regular PCs to process meetings, interviews, videos, and recordings.

On-Device Compression and Inference Adaptation

Tailors speech AI models for desktop operation and reduces wait times. Transcription runs locally, with no dependency on external services.

Flash / Pro Dual Model Synergy

Flash is designed for everyday meetings and video transcription, offering faster responses. Pro targets challenging audio, complex accents, and higher accuracy needs, using slightly more local resources.

Integrated into Developer Workflows

Supports 40+ languages, multiple English accents, and Chinese dialects. The same local engine can be accessed by CLI, agents, and automation via a gateway.

Real Desktop Experience

Lattice-2: Already Running in EdgeSpeak Desktop

Current desktop features: import audio or video, select a Lattice-2 model, and export the results. For automation, the speech gateway lets agents send audio to the same local engine and get transcripts back.

Audio, Video, and Transcripts in One Workspace

Playback, timeline, transcripts, export, recent files, and the current Lattice model are all together. Fewer tool switches, less context loss.

Screenshot: EdgeSpeak desktop transcription main interface showing the main transcription workspace, local Lattice model status, transcript content, playback controls, and recent files.

Select Lattice Model by Task

Lattice-2 supports 40+ languages, multiple English accents, and Chinese dialects. Flash is fast with good results; Pro is more accurate but uses more local resources.

Screenshot: EdgeSpeak desktop model page showing local Lattice-2 Flash and Lattice-2 Pro options.

Let Other Tools Use Lattice Too

CLI, agents, and automation can send audio to EdgeSpeak and receive completed transcripts from Lattice.

Screenshot: EdgeSpeak desktop gateway page showing how other tools on the same computer can use the local speech engine.
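The automation path described here could look roughly like the sketch below: a script scans a folder for media files that have no transcript yet and hands each one to a transcription function. This is an illustration only, not EdgeSpeak's own tooling. The suffix list, the `.txt` output convention, and the function names `pending_media` and `transcribe_folder` are assumptions; `transcribe` stands for any callable that sends one file to the local engine and returns its text.

```python
from pathlib import Path

# Assumed set of media extensions; the page does not list supported formats.
MEDIA_SUFFIXES = {".wav", ".mp3", ".m4a", ".mp4", ".mov"}


def pending_media(folder):
    """Return media files in `folder` that have no .txt transcript beside them."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file()
        and p.suffix.lower() in MEDIA_SUFFIXES
        and not p.with_suffix(".txt").exists()
    )


def transcribe_folder(folder, transcribe):
    """Run `transcribe(path) -> str` on each pending file and save the text.

    `transcribe` is a hypothetical callable supplied by the caller, for
    example a wrapper around a request to the local speech gateway.
    """
    written = []
    for media in pending_media(folder):
        out = media.with_suffix(".txt")
        out.write_text(transcribe(media), encoding="utf-8")
        written.append(out)
    return written
```

Passing the transcription step in as a parameter keeps the folder logic independent of how the engine is reached, so the same loop works from a CLI wrapper, an agent tool, or a scheduled job.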

Early Bird

Early Bird $29: Put Lattice into Your Workflow

The current version already uses Lattice-2 for local audio/video transcription and connects via the local gateway to CLI, agents, and automation. Lifetime license includes future model and speech capability updates: buy what works now, and grow together as it evolves.

Early Bird Price

Lifetime License

$29 (regular price $99), one-time payment

Local audio/video transcription available now
Local gateway and CLI available now
Future model and speech capability updates
Up to 4 devices

For more devices or team purchases: [email protected]

Local Speech Gateway

Let Agents and Automation Tools Directly Call Lattice-2 on This Computer

EdgeSpeak wraps Lattice-2 into a local speech API on the same machine. CLI, agents, automation scripts, and tools compatible with the OpenAI transcription API can send audio to the local engine and get transcripts back. It is not another cloud service; it is a speech gateway inside your computer.

Local OpenAI-Compatible Interface
CLI / Agent / Automation
Lattice-2 Flash and Pro

edgespeak gateway - 127.0.0.1:1117

curl http://127.0.0.1:1117/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-edgespeak-..." \
  -F [email protected] \
  -F model="lattice-2-flash"
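For scripts that prefer Python over curl, the same request can be sketched with the standard library alone. The endpoint, bearer-token header, and `lattice-2-flash` model name come from the command above. The form field name `file` and the `{"text": ...}` response shape are assumptions based on the OpenAI transcription API that the gateway is said to be compatible with, and `build_request` and `transcribe` are illustrative names rather than part of EdgeSpeak.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

GATEWAY_URL = "http://127.0.0.1:1117/v1/audio/transcriptions"  # from the page


def build_request(audio_path, api_key, model="lattice-2-flash"):
    """Build a multipart/form-data POST equivalent to the curl call."""
    path = Path(audio_path)
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()
    # Assumption: the upload field is called "file", as in the OpenAI API.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    closing = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        GATEWAY_URL,
        data=model_part + file_header + path.read_bytes() + closing,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )


def transcribe(audio_path, api_key, model="lattice-2-flash"):
    """Send the file to the local gateway and return the transcript text."""
    with urllib.request.urlopen(build_request(audio_path, api_key, model)) as resp:
        # Assumption: the reply mirrors OpenAI's default JSON shape {"text": "..."}.
        return json.loads(resp.read())["text"]
```

Splitting request construction from sending makes it possible to inspect the headers and body without the gateway running. Tools that already use an OpenAI client library may only need their base URL pointed at `http://127.0.0.1:1117/v1` instead.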

Feiteng (@FeitengLi): Bookmarks have passed 1,000. It seems Twitter friends have long suffered from inaccurate recognition and unstable timestamps.

Few tools have nailed all three: accurate recognition + stable timestamps + speaker labeling.

Maybe I should try my hand at writing a desktop app: ~1GB memory, 40x real-time on M4 (1 minute transcribes 40 minutes of audio).

Desktop app: drag-and-drop transcription, direct mic capture, meeting notes
Compatible with the OpenAI Audio API, can take over OpenClaw
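To make the 40x real-time figure quoted above concrete, the small calculation below converts audio length into approximate processing time. It takes the 40x number as stated for an M4. Actual speed on other hardware or with a different model is not given in the source, so `speed_factor` is left as a parameter, and the function name is illustrative.

```python
def estimated_transcription_minutes(audio_minutes, speed_factor=40.0):
    """Wall-clock minutes needed at `speed_factor` times real-time."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return audio_minutes / speed_factor


for audio in (40, 60, 90):
    minutes = estimated_transcription_minutes(audio)
    print(f"{audio} min of audio -> ~{minutes:.2f} min of processing")
# 40 min of audio -> ~1.00 min of processing
# 60 min of audio -> ~1.50 min of processing
# 90 min of audio -> ~2.25 min of processing
```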

Similar Articles

@iluciddreaming: Google just killed another startup... Google AI Edge Eloquent now supports Mac, a fully local Wispr Flow alternative. Based on the latest Gemma model, supports real-time voice transcription + voice commands to edit text. Free, no subscription, no...

@dotey: https://x.com/dotey/status/2057250417638035555

@GitTrend0x: Holy cow, guys! Run voice cloning and cinematic video dubbing locally, supporting 646 languages, fully offline, no API key, no internet needed. ElevenLabs is crushed! https://github.com/debpalash/OmniVoice-Studio… This open-source marvel is insane...

@uniswap12: Microsoft open-sourced a voice AI that can transcribe 60 minutes of long audio in one go, handling 4 people speaking simultaneously. VibeVoice, open-sourced by Microsoft, 24.8k stars, I only found out about it today. For converting recordings to text, I've been using Whisper, but it often times out on long meeting recordings and struggles with multi-speaker recognition...

@noahduck283: A tool that can download any YouTube video, cleanly remove vocals, transcribe, translate into 100+ languages, clone the original voice, and perform fully automatic dubbing. It takes less than 2 minutes. 100% runs locally. Free. Sews six top open-source models into a web page for "one-click download, vocal removal, transcription, translation, dubbing"...