@Honcia13: Highly recommend an open-source speech-to-subtitle tool! Incredible speed and top-notch quality! Supports multiple languages including Chinese, Japanese, Korean, English, etc., with specially optimized formatting rules for natural and professional subtitles. It's a desktop tool based on PySide6 + ElevenLabs API that can convert audio/video files or JSON…


Summary

Scribe2SRT is an open-source speech-to-subtitle desktop tool built on PySide6 and the ElevenLabs API. It supports multiple languages and applies language-specific formatting rules to generate high-quality SRT subtitles quickly.

Highly recommend an open-source speech-to-subtitle tool! Incredible speed and top-notch quality! Supports multiple languages including Chinese, Japanese, Korean, English, etc., with specially optimized formatting rules for natural and professional subtitles. It's a desktop tool based on PySide6 + ElevenLabs API that can intelligently convert audio/video files or JSON transcripts into high-quality SRT subtitles, especially suitable for the formatting conventions of Chinese, Japanese, Korean, and English. For those who make videos, edit, create courseware, or produce subtitles, this is really awesome! https://github.com/cylind/scribe2srt…

Cached at: 05/31/26, 07:03 AM

Strongly recommend an open-source speech-to-subtitle tool! Extremely fast with top-notch quality! Supports multiple languages including Chinese, Japanese, Korean, English, and more, with specially optimized formatting rules for natural and professional subtitles. This is a desktop tool based on PySide6 + ElevenLabs API that intelligently converts audio/video files or JSON transcripts into high-quality SRT subtitles, especially suited for CJK and English formatting conventions. Perfect for video creators, editors, course makers, and subtitle producers! https://github.com/cylind/scribe2srt…

---

# cylind/scribe2srt

Source: https://github.com/cylind/scribe2srt

# Scribe2SRT

Scribe2SRT is a professional audio/video to subtitle tool. By integrating ElevenLabs speech recognition technology and intelligent subtitle segmentation algorithms, subtitle production becomes simple and efficient.

## 🚀 Key Features

- 🎯 High-quality transcription: Based on ElevenLabs’ advanced speech recognition technology
- 🌍 Multi-language support: Supports Chinese, English, Japanese, Korean, and more
- 📝 Professional subtitle standards: Follows industry standards like Netflix for subtitle production
- ⚡ Smart segmentation algorithm: Semantic segmentation based on punctuation priority, maintaining sentence integrity
- 🔄 Smart retry mechanism: Automatically saves temporary files on failure, quick recovery on retry
- 🎨 User-friendly interface: Clean and intuitive graphical user interface with drag-and-drop support
- 📊 Real-time progress feedback: Clear progress display and status prompts

## 💻 Installation & Usage

### Quick Start

1. Go to the Releases page (https://github.com/cylind/scribe2srt/releases) and download the latest version
2. Extract and run the program directly
3. It is recommended to install FFmpeg: For video file processing, improving compatibility and efficiency

Run from source

#### Installation Steps

1. Download the project: `git clone https://github.com/your-username/scribe2srt.git`, then `cd scribe2srt`
2. Install dependencies: `pip install -r requirements.txt`
3. Run the program: `python app.py`

## 📖 Usage Instructions

### Basic Workflow

1. Select input file
   - Click the “Select File” button or drag and drop a file into the program window
   - Supports three input types:
     - Audio files: All common audio formats (MP3, WAV, FLAC, M4A, AAC, OGG, etc.)
     - Video files: All common video formats (MP4, MOV, MKV, AVI, FLV, WEBM, etc.)
     - JSON transcript files: ElevenLabs format transcription data
2. Configure processing options
   - Language selection: Select the source language or use “Auto Detect”
   - Audio event marking: Choose whether to mark non-speech events (e.g., laughter, applause)
3. Start processing
   - Click the “Generate Subtitles” button to begin transcription
   - The program will display detailed processing progress
4. Get results
   - After processing, the SRT subtitle file is automatically saved in the same directory as the source file
   - The program shows the output file path

### Subtitle Quality Standards

This tool follows professional subtitle production standards:

- Duration control: Minimum 0.83 seconds, maximum 7.0 seconds
- Character density: CJK languages up to 11 characters per second, Latin languages up to 15 characters per second
- Line length limit: CJK languages max 25 characters per line, Latin languages max 42 characters per line
- Semantic integrity: Prioritizes maintaining sentence completeness, segmentation based on punctuation priority

## ⚙️ Advanced Settings

### Subtitle Parameter Adjustment

Via the “Subtitle Settings” menu, you can adjust:

- Subtitle display duration and gap
- Character density limits
- Characters per line limit

### Large File Processing

- Automatic segmentation for long files (90+ minutes)
- Supports concurrent processing for faster speed
- Smart retry mechanism ensures processing success

## 🔧 Technical Highlights

### Smart Segmentation Algorithm

- Two-stage processing: Sentence pre-segmentation + intelligent merging
- Punctuation priority: Segmentation strategy based on linguistic rules
- Semantic integrity: Avoids breaking sentence structure
- Multi-language optimization: Differentiated processing for different languages

## 📄 License

This project is licensed under the MIT license.

---

If this project helps you, please give us a ⭐ Star!
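
The limits listed under Subtitle Quality Standards in the README above can be expressed as a small checker. This is only a minimal sketch, not Scribe2SRT's actual code: the names `Cue`, `script_of`, and `check_cue` are hypothetical, the Unicode-range test for CJK text is a simplifying assumption, and counting spaces toward character density is also an assumption. Only the numeric limits come from the README.

```python
from dataclasses import dataclass

# Limits quoted from the README's "Subtitle Quality Standards" section.
MIN_DURATION_S = 0.83
MAX_DURATION_S = 7.0
MAX_CPS = {"cjk": 11, "latin": 15}          # characters per second
MAX_LINE_CHARS = {"cjk": 25, "latin": 42}   # characters per line


@dataclass
class Cue:
    """Hypothetical container for one subtitle cue."""
    start: float  # seconds
    end: float    # seconds
    text: str     # may contain "\n" between subtitle lines


def script_of(text: str) -> str:
    """Assumed heuristic: any Han, kana, or Hangul character means CJK."""
    for ch in text:
        cp = ord(ch)
        if (0x4E00 <= cp <= 0x9FFF          # CJK Unified Ideographs
                or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
                or 0xAC00 <= cp <= 0xD7AF): # Hangul syllables
            return "cjk"
    return "latin"


def check_cue(cue: Cue) -> list[str]:
    """Return the names of the README limits that this cue violates."""
    problems = []
    duration = cue.end - cue.start
    script = script_of(cue.text)

    if duration < MIN_DURATION_S:
        problems.append("too short")
    elif duration > MAX_DURATION_S:
        problems.append("too long")

    # Assumption: every character except the line break counts, spaces included.
    chars = len(cue.text.replace("\n", ""))
    if duration > 0 and chars / duration > MAX_CPS[script]:
        problems.append("too dense")

    if any(len(line) > MAX_LINE_CHARS[script] for line in cue.text.split("\n")):
        problems.append("line too long")

    return problems
```

For example, `check_cue(Cue(0.0, 0.5, "Hi"))` reports `"too short"` because 0.5 seconds is below the 0.83-second minimum. A segmentation pass could run a check like this after merging sentences and re-split any cue that fails.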

Similar Articles

@yhslgg: Bro, sharing another open-source video translation tool—pyVideoTrans, with 17,700 stars on GitHub, a must-have for video repurposing and localization! In a nutshell: drop a video in, and it automatically runs through the entire pipeline of speech recognition → subtitle translation → AI dubbing → video synthesis, outputting a complete video in another language. Core...

X AI KOLs Timeline

pyVideoTrans is an open-source video translation tool that supports automatic speech recognition, subtitle translation, AI dubbing, and video synthesis. It integrates multiple ASR, translation, and TTS engines, making it suitable for cross-language video production and localization.

@wsl8297: Want to turn ebooks or documents into audiobooks? Many tools sound too robotic or lack subtitle sync, leaving you frustrated. Then I found the open-source project Abogen: it supports ePub, PDF, plain text, etc., one-click conversion to high-quality audio with auto-generated synchronized subtitles. It uses Kokoro voice at its core…

X AI KOLs Timeline

Abogen is an open-source tool that can convert documents like ePub and PDF into high-quality audio with one click, automatically generating synchronized subtitles. It supports a voice mixer and multiple deployment methods.

@yhslgg: Old Yang shares another gem open-source tool—KrillinAI, 10,000 stars on GitHub, a must-see for multilingual audio/video content! In a nutshell: from video download to subtitle translation, AI dubbing, video compositing, the entire pipeline is covered, and it can even auto-generate platform covers, supporting Bilibili, Douyin, Xiaohongshu, YouTube…

X AI KOLs Timeline

KrillinAI is an open-source tool that integrates the entire workflow of video downloading, subtitle translation, AI dubbing, and video compositing. It supports context-aware translation, voice cloning, auto layout, and cover generation, and is compatible with multiple AI models, suitable for multilingual audio/video content creation and distribution.

@noahduck283: A tool that can download any YouTube video, cleanly remove vocals, transcribe, translate into 100+ languages, clone the original voice, and perform fully automatic dubbing. It takes less than 2 minutes. 100% runs locally. Free. Sews six top open-source models into a web page for "one-click download, vocal removal, transcription, translation, dubbing"...

X AI KOLs Timeline

Voice-Pro is a web tool that integrates six top open-source models (Whisper, Demucs, CosyVoice, F5-TTS, etc.), supporting YouTube video downloading, vocal removal, transcription, translation, voice cloning, and fully automatic dubbing. It takes less than 2 minutes, runs 100% locally, and is free.

@VincentLogic: Found an incredible open-source desktop AI tool from ByteDance! UI-TARS Desktop, with 31k stars, truly lives up to the hype. It can actually understand your screen and automate computer operations for you. Just tell it "Enable auto-save in VS Code and set the delay to 500ms", and it will automatically: -…

X AI KOLs Timeline

ByteDance's open-source desktop AI automation tool, UI-TARS Desktop, supports local execution and screen visual understanding. It can autonomously control your computer to handle daily tasks through natural language commands.