@yhslgg: Bro, sharing another open-source video translation tool—pyVideoTrans, with 17,700 stars on GitHub, a must-have for video repurposing and localization! In a nutshell: drop a video in, and it automatically runs through the entire pipeline of speech recognition → subtitle translation → AI dubbing → video synthesis, outputting a complete video in another language. Core...

X AI KOLs Timeline 06/03/26, 04:07 AM Tools

open-source video-translation tts asr subtitle dubbing gpu-acceleration

Summary

pyVideoTrans is an open-source video translation tool that supports automatic speech recognition, subtitle translation, AI dubbing, and video synthesis. It integrates multiple ASR, translation, and TTS engines, making it suitable for cross-language video production and localization.

Cached at: 06/03/26, 07:54 PM

Guys, sharing another open-source video translation tool — pyVideoTrans, 17,700 stars on GitHub. A must-have for video repurposing and localization!

In a nutshell: Drop a video in, it automatically runs through the full pipeline — speech recognition → subtitle translation → AI voiceover → video synthesis — and out comes the complete video in another language.
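To picture how the stages hand work to each other, here is a minimal sketch in Python. It is not pyVideoTrans's actual code: `Segment`, `run_pipeline`, and the stand-in stage functions are hypothetical names, and the stages only shuffle text around so the example runs without any model or ffmpeg.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Segment:
    """One subtitle line with start/end times in seconds (hypothetical type)."""
    start: float
    end: float
    text: str
    audio: str = ""  # path of the dubbed clip, filled in by the dubbing stage


Stage = Callable[[List[Segment]], List[Segment]]


def run_pipeline(segments: List[Segment], stages: List[Stage]) -> List[Segment]:
    """Feed the output of each stage into the next one, in order."""
    for stage in stages:
        segments = stage(segments)
    return segments


# Stand-in stages: real ones would call a translation or TTS engine.
def fake_translate(segments: List[Segment]) -> List[Segment]:
    lookup = {"你好": "Hello", "谢谢": "Thank you"}
    return [replace(s, text=lookup.get(s.text, s.text)) for s in segments]


def fake_dub(segments: List[Segment]) -> List[Segment]:
    return [replace(s, audio=f"clip_{i:03d}.wav") for i, s in enumerate(segments)]


# Pretend this list came out of the speech recognition stage.
recognized = [Segment(0.0, 1.2, "你好"), Segment(1.5, 2.4, "谢谢")]
result = run_pipeline(recognized, [fake_translate, fake_dub])
for seg in result:
    print(f"{seg.start:.1f}-{seg.end:.1f} {seg.text} -> {seg.audio}")
```

Keeping every stage behind the same signature is what would make it easy to swap one engine for another, or to stop between stages and inspect the segments by hand.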

Core advantages, one by one:

(1) Speaker diarization — Can handle multi-speaker videos, distinguishing different voices so subtitles don’t get mixed up. Works for interviews, variety shows, courses.

(2) Voice cloning — Integrates F5-TTS, CosyVoice, GPT-SoVITS. You can clone a specific voice for dubbing instead of using generic synthetic audio, giving a more natural, human-like result.

(3) Pause for manual review at each stage — Check after recognition, edit after translation. No need to fully trust automation; quality control stays in your hands.

(4) Comprehensive tech stack, easy to swap — ASR supports Faster-Whisper, Alibaba Qwen, Azure, Google; Translation supports DeepSeek, Claude, Gemini, ChatGPT, Ollama local models; TTS offers Edge-TTS (free), OpenAI, Azure, ChatTTS, ChatterBox. Pick your preferred tool for each step.

(5) GPU acceleration — Supports NVIDIA CUDA and AMD GPUs, maxing out processing speed so you don’t waste time waiting.

(6) CLI headless mode — Command-line support, ideal for batch processing on servers, enabling automated pipelines.

(7) Windows portable exe version — No need to set up a Python environment. Just download and run, very user-friendly for casual users.
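To make point (1) concrete: one common way to attach speakers to subtitles is to give each recognized line the speaker whose diarization turn overlaps it the most. The sketch below assumes that approach. The post does not say how pyVideoTrans does it, and `assign_speakers` plus the tuple layouts are hypothetical.

```python
from typing import List, Optional, Tuple

Turn = Tuple[float, float, str]      # (start, end, speaker label) from a diarizer
Subtitle = Tuple[float, float, str]  # (start, end, text) from speech recognition


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the time shared by two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    subs: List[Subtitle], turns: List[Turn]
) -> List[Tuple[Optional[str], str]]:
    """Label each subtitle with the speaker whose turn overlaps it the most."""
    labelled = []
    for start, end, text in subs:
        best_label, best_overlap = None, 0.0
        for t_start, t_end, label in turns:
            shared = overlap(start, end, t_start, t_end)
            if shared > best_overlap:
                best_label, best_overlap = label, shared
        # best_label stays None when no speaker turn touches this subtitle
        labelled.append((best_label, text))
    return labelled


turns = [(0.0, 4.0, "SPEAKER_A"), (4.0, 9.0, "SPEAKER_B")]
subs = [
    (0.5, 3.0, "Welcome to the show."),
    (3.5, 6.0, "Thanks for having me."),
    (10.0, 11.0, "(music)"),
]
labelled = assign_speakers(subs, turns)
for speaker, text in labelled:
    print(speaker, text)
```

The second line straddles both turns but shares more time with the second speaker, so it goes to that speaker. The last line overlaps nobody and is left unlabelled for a human to check.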

Who it’s for: anyone repurposing content across languages, whether that means adding Chinese subtitles to foreign content, taking Chinese videos global by translating them into English, Japanese, or Korean, or translating English courses into Chinese for personal use. Efficiency doubles instantly.

Similar Articles

@Russell3402: Alibaba International's open-source AI-powered fully automated short video engine, Pixelle-Video. Simply input a topic, and it automatically generates a complete short video. From copywriting and voiceovers to image selection and editing, everything is handled by AI. GitHub:

@yhslgg: Old Yang shares another gem open-source tool—KrillinAI, 10,000 stars on GitHub, a must-see for multilingual audio/video content! In a nutshell: from video download to subtitle translation, AI dubbing, video compositing, the entire pipeline is covered, and it can even auto-generate platform covers, supporting Bilibili, Douyin, Xiaohongshu, YouTube…

@berryxia: Guys, this is awesome! Install it right away! Kevin Lin, postdoc at Oxford, former Meta and Microsoft researcher, just released Violin, an open-source video translation Skill. Video is already the absolute dominant content form on the internet. Yet most high-quality lectures, speeches, and podcasts are locked by a single language…

@rwayne: Video translation has been cracked by a single Oxford postdoc. Kevin Lin, a postdoc at Oxford University, open-sourced Violin, a video translation tool that integrates speech recognition, LLM translation, and speech synthesis into an automated pipeline. It supports multilingual translation, personalized translation styles, and all-in-one video dialogue; it can turn academic reports into children's...

@aigclink: An open-source end-to-end video translation + video Q&A Skill: violin. The highlight is not just literal translation, but the idea of content re-creation. It integrates ASR, LLM translation, and TTS into a seamless pipeline video Skill. The three modules are automatically chained: input a video and get a dubbed translated video. Translation style is adjustable, for example...
