@yhslgg: Bro, sharing another open-source video translation tool—pyVideoTrans, with 17,700 stars on GitHub, a must-have for video repurposing and localization! In a nutshell: drop a video in, and it automatically runs through the entire pipeline of speech recognition → subtitle translation → AI dubbing → video synthesis, outputting a complete video in another language. Core...
Summary
pyVideoTrans is an open-source video translation tool that supports automatic speech recognition, subtitle translation, AI dubbing, and video synthesis. It integrates multiple ASR, translation, and TTS engines, making it suitable for cross-language video production and localization.
Cached at: 06/03/26, 07:54 PM
Guys, sharing another open-source video translation tool — pyVideoTrans, 17,700 stars on GitHub. A must-have for video repurposing and localization!
In a nutshell: Drop a video in, it automatically runs through the full pipeline — speech recognition → subtitle translation → AI voiceover → video synthesis — and out comes the complete video in another language.
Core advantages, one by one:
(1) Speaker diarization: Handles multi-speaker videos by telling voices apart, so subtitles don't get mixed up between speakers. Works for interviews, variety shows, and courses.
(2) Voice cloning: Integrates F5-TTS, CosyVoice, and GPT-SoVITS. You can clone a specific voice for dubbing instead of using generic synthetic audio, for a more natural, human-like result.
(3) Pause for manual review at each stage: Check the transcript after recognition and edit the text after translation. You don't have to fully trust the automation; quality control stays in your hands.
(4) Broad, swappable tech stack: ASR supports Faster-Whisper, Alibaba Qwen, Azure, and Google; translation supports DeepSeek, Claude, Gemini, ChatGPT, and local models via Ollama; TTS offers Edge-TTS (free), OpenAI, Azure, ChatTTS, and ChatterBox. Pick your preferred engine for each step.
(5) GPU acceleration: Supports NVIDIA CUDA and AMD GPUs, which cuts processing time so you spend less of it waiting.
(6) CLI headless mode: Runs from the command line, which suits batch processing on servers and automated pipelines.
(7) Windows portable exe version: No Python environment to set up. Just download and run, which is friendly for casual users.
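To make the "swap any engine, pause for review" idea concrete, here is a minimal Python sketch of how such a pipeline could be wired together. It is not pyVideoTrans's actual code or API: `Segment`, `run_pipeline`, the stage signatures, and the `fake_*` stand-in engines are all hypothetical names made up for illustration, and the final video synthesis step is left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Segment:
    """One subtitle line with timing and a speaker label (hypothetical type)."""
    start: float
    end: float
    speaker: str
    text: str
    translation: str = ""
    audio: str = ""


# Hypothetical stage signatures; any engine wrapped to match can be swapped in.
Recognizer = Callable[[str], List[Segment]]   # video path -> segments
Translator = Callable[[str, str], str]        # (text, target language) -> text
Dubber = Callable[[str, str], str]            # (text, speaker) -> audio clip label
Reviewer = Callable[[str, List[Segment]], List[Segment]]  # (stage, segments) -> segments


def run_pipeline(
    video: str,
    target_lang: str,
    recognize: Recognizer,
    translate: Translator,
    dub: Dubber,
    review: Optional[Reviewer] = None,
) -> List[Segment]:
    """Chain recognition, translation, and dubbing, with optional review pauses."""
    segments = recognize(video)
    if review is not None:
        segments = review("recognition", segments)   # check after recognition
    for seg in segments:
        seg.translation = translate(seg.text, target_lang)
    if review is not None:
        segments = review("translation", segments)   # edit after translation
    for seg in segments:
        seg.audio = dub(seg.translation, seg.speaker)  # one voice per speaker
    return segments


# Stand-in engines so the sketch runs without any model installed.
def fake_recognize(video: str) -> List[Segment]:
    return [
        Segment(0.0, 1.5, "spk0", "hello"),
        Segment(1.5, 3.0, "spk1", "thanks"),
    ]


def fake_translate(text: str, target_lang: str) -> str:
    return f"[{target_lang}] {text}"


def fake_dub(text: str, speaker: str) -> str:
    return f"{speaker}:{text}"


def manual_review(stage: str, segments: List[Segment]) -> List[Segment]:
    # A human edit after translation, simulated here by overwriting one line.
    if stage == "translation":
        segments[1].translation = "[zh] thank you"
    return segments


result = run_pipeline(
    "talk.mp4", "zh", fake_recognize, fake_translate, fake_dub, manual_review
)
for seg in result:
    print(seg.speaker, seg.translation, seg.audio)
```

Because each stage is just a callable, swapping one engine for another means passing a different function, and the reviewer hook is where a person (or a script) can correct text before it reaches the dubbing step.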
Who it’s for: Anyone repurposing content across languages, whether that means adding Chinese subtitles to foreign content, taking Chinese videos global by translating into English/Japanese/Korean, or translating English courses into Chinese for personal use. A big time saver in every case.
Similar Articles
@yhslgg: Old Yang shares another gem of an open-source tool: KrillinAI, 10,000 stars on GitHub, a must-see for multilingual audio/video content! In a nutshell: from video download to subtitle translation, AI dubbing, and video compositing, the entire pipeline is covered, and it can even auto-generate platform covers, supporting Bilibili, Douyin, Xiaohongshu, YouTube…
KrillinAI is an open-source tool that integrates the entire workflow of video downloading, subtitle translation, AI dubbing, and video compositing. It supports context-aware translation, voice cloning, auto layout, and cover generation, and is compatible with multiple AI models, suitable for multilingual audio/video content creation and distribution.
@berryxia: Guys, this is awesome! Install it right away! Kevin Lin, postdoc at Oxford, former Meta and Microsoft researcher, just released Violin, an open-source video translation Skill. Video is already the absolute dominant content form on the internet. Yet most high-quality lectures, speeches, and podcasts are locked by a single language…
Violin is an open-source video translation tool that integrates speech recognition, large language model translation, and text-to-speech. It supports over 30 languages and offers three usage modes: CLI, web app, and Claude Code.
@rwayne: Video translation has been cracked by a single Oxford postdoc. Kevin Lin, a postdoc at Oxford University, open-sourced Violin, a video translation tool that integrates speech recognition, LLM translation, and speech synthesis into an automated pipeline. It supports multilingual translation, personalized translation styles, and all-in-one video dialogue; it can turn academic reports into children's...
Kevin Lin, a postdoctoral fellow at Oxford University, open-sourced Violin, a video translation tool that integrates speech recognition, LLM translation, and speech synthesis into an automated pipeline. It supports multilingual translation and personalized styles, and provides three usage modes: Web, CLI, and Agent.
@aigclink: An open-source end-to-end video translation + video Q&A Skill: violin. The highlight is not just literal translation, but the idea of content re-creation. It integrates ASR, LLM translation, and TTS into one seamless video pipeline Skill. The three modules are automatically chained: input a video and get a dubbed translated video. Translation style is adjustable, for example...
Violin is an open-source end-to-end video translation and video Q&A tool, integrating ASR, LLM translation, and TTS. It supports style adjustment and content re-creation, and can answer questions about video content.
@Russell3402: Alibaba International's open-source AI-powered fully automated short video engine, Pixelle-Video. Simply input a topic, and it automatically generates a complete short video. From copywriting and voiceovers to image selection and editing, everything is handled by AI. GitHub:
Alibaba International has open-sourced the AI-powered fully automated short video engine Pixelle-Video, allowing users to generate complete short videos—including copywriting, voiceovers, images, and editing—by simply inputting a topic.