@yhslgg: Bro, sharing another open-source video translation tool—pyVideoTrans, with 17,700 stars on GitHub, a must-have for video repurposing and localization! In a nutshell: drop a video in, and it automatically runs through the entire pipeline of speech recognition → subtitle translation → AI dubbing → video synthesis, outputting a complete video in another language. Core...

X AI KOLs Timeline Tools

Summary

pyVideoTrans is an open-source video translation tool that supports automatic speech recognition, subtitle translation, AI dubbing, and video synthesis. It integrates multiple ASR, translation, and TTS engines, making it suitable for cross-language video production and localization.


Cached at: 06/03/26, 07:54 PM

Guys, sharing another open-source video translation tool — pyVideoTrans, 17,700 stars on GitHub. A must-have for video repurposing and localization!

In a nutshell: Drop a video in, it automatically runs through the full pipeline — speech recognition → subtitle translation → AI voiceover → video synthesis — and out comes the complete video in another language.
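To make that four-step flow concrete, here is a rough Python sketch of how the stages could chain together. This is not pyVideoTrans's actual code or API: every function name here (`recognize`, `translate`, `dub`, `synthesize`, `translate_video`) is a made-up placeholder, and the stage bodies are toy stand-ins for real ASR, translation, and TTS engines.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Segment:
    start: float  # seconds from the start of the video
    end: float
    text: str


# All four stage functions below are hypothetical stand-ins, not real tool APIs.
def recognize(video_path: str) -> List[Segment]:
    # A real ASR engine would transcribe the audio track here.
    return [Segment(0.0, 1.5, "hello"), Segment(1.5, 3.0, "world")]


def translate(segments: List[Segment], target_lang: str) -> List[Segment]:
    # Toy glossary lookup in place of an LLM or translation API.
    glossary = {"hello": "你好", "world": "世界"}
    return [replace(s, text=glossary.get(s.text, s.text)) for s in segments]


def dub(segments: List[Segment]) -> List[str]:
    # Pretend each translated segment becomes one audio clip.
    return [f"clip_{i:03d}.wav" for i, _ in enumerate(segments)]


def synthesize(video_path: str, clips: List[str], target_lang: str) -> str:
    # A real tool would mux the clips and subtitles back into the video.
    stem = video_path.rsplit(".", 1)[0]
    return f"{stem}.{target_lang}.mp4"


def translate_video(video_path: str, target_lang: str) -> str:
    segments = recognize(video_path)               # speech recognition
    translated = translate(segments, target_lang)  # subtitle translation
    clips = dub(translated)                        # AI voiceover
    return synthesize(video_path, clips, target_lang)  # video synthesis


output = translate_video("talk.mp4", "zh")
```

The timestamps stay attached to each segment through every stage, which is what lets the dubbed audio line up with the original video at the end.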

Core advantages, one by one:

(1) Speaker diarization — Can handle multi-speaker videos, distinguishing different voices so subtitles don’t get mixed up. Works for interviews, variety shows, courses.

(2) Voice cloning — Integrates F5-TTS, CosyVoice, GPT-SoVITS. You can clone a specific voice for dubbing instead of using generic synthetic audio, giving a more natural, human-like result.

(3) Pause for manual review at each stage — Check after recognition, edit after translation. No need to fully trust automation; quality control stays in your hands.

(4) Comprehensive tech stack, easy to swap — ASR supports Faster-Whisper, Alibaba Qwen, Azure, Google; Translation supports DeepSeek, Claude, Gemini, ChatGPT, Ollama local models; TTS offers Edge-TTS (free), OpenAI, Azure, ChatTTS, ChatterBox. Pick your preferred tool for each step.

(5) GPU acceleration — Supports NVIDIA CUDA and AMD GPUs, maxing out processing speed so you don’t waste time waiting.

(6) CLI headless mode — Command-line support, ideal for batch processing on servers, enabling automated pipelines.

(7) Windows portable exe version — No need to set up a Python environment. Just download and run, very user-friendly for casual users.
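Points (3) and (4) above, pausing for review and swapping engines per step, can be pictured with a small Python sketch. Treat it as an illustration only: the `ENGINES` registry, the `run` function, and the `fix_greeting` reviewer are all assumptions invented for this example, and the engine names are just labels borrowed from the post with toy bodies behind them, not calls into the real libraries.

```python
from typing import Callable, Dict, List, Optional

Subtitles = List[str]
Stage = Callable[[Subtitles], Subtitles]
Reviewer = Callable[[str, Subtitles], Subtitles]

# Hypothetical registry: one dict of interchangeable engines per pipeline step.
ENGINES: Dict[str, Dict[str, Stage]] = {
    "asr": {"faster-whisper": lambda _: ["hello everyone"]},
    "translate": {"ollama": lambda subs: [f"[zh] {s}" for s in subs]},
    "tts": {"edge-tts": lambda subs: [f"audio({s})" for s in subs]},
}


def run(choices: Dict[str, str], reviewer: Optional[Reviewer] = None) -> Subtitles:
    data: Subtitles = []
    for stage in ("asr", "translate", "tts"):
        engine = ENGINES[stage][choices[stage]]  # pick the engine chosen for this step
        data = engine(data)
        # Pause point: let a human (or script) edit the text before the next step.
        if reviewer is not None and stage != "tts":
            data = reviewer(stage, data)
    return data


def fix_greeting(stage: str, subs: Subtitles) -> Subtitles:
    # Example manual correction applied right after recognition.
    if stage == "asr":
        return [s.replace("hello", "Hello") for s in subs]
    return subs


choices = {"asr": "faster-whisper", "translate": "ollama", "tts": "edge-tts"}
result = run(choices, reviewer=fix_greeting)
```

Because the correction happens right after recognition, it flows into the translation and the dubbing, which is the practical benefit of reviewing early instead of fixing the finished video.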

Who it’s for: anyone repurposing content across languages, whether that means adding Chinese subtitles to foreign videos, taking Chinese videos global by translating them into English, Japanese, or Korean, or translating English courses into Chinese for personal use. For these jobs it can cut out a lot of manual work.
