@Honcia13: Highly recommend an open-source speech-to-subtitle tool! Incredible speed and top-notch quality! Supports multiple languages including Chinese, Japanese, Korean, English, etc., with specially optimized formatting rules for natural and professional subtitles. It's a desktop tool based on PySide6 + ElevenLabs API that can convert audio/video files or JSON…
Summary
Recommends Scribe2SRT, an open-source speech-to-subtitle desktop tool built on PySide6 and the ElevenLabs API. It supports multiple languages and applies optimized formatting rules to quickly generate high-quality SRT subtitles.
Cached at: 05/31/26, 07:03 AM
Strongly recommend an open-source speech-to-subtitle tool! Extremely fast with top-notch quality! Supports multiple languages including Chinese, Japanese, Korean, English, and more, with specially optimized formatting rules for natural and professional subtitles. This is a desktop tool based on PySide6 + ElevenLabs API that intelligently converts audio/video files or JSON transcripts into high-quality SRT subtitles, especially suited for CJK and English formatting conventions. Perfect for video creators, editors, course makers, and subtitle producers! https://github.com/cylind/scribe2srt…

---

# cylind/scribe2srt

Source: https://github.com/cylind/scribe2srt

# Scribe2SRT

Scribe2SRT is a professional audio/video to subtitle tool. By integrating ElevenLabs speech recognition technology and intelligent subtitle segmentation algorithms, subtitle production becomes simple and efficient.

## 🚀 Key Features

- 🎯 High-quality transcription: Based on ElevenLabs’ advanced speech recognition technology
- 🌍 Multi-language support: Supports Chinese, English, Japanese, Korean, and more
- 📝 Professional subtitle standards: Follows industry standards like Netflix for subtitle production
- ⚡ Smart segmentation algorithm: Semantic segmentation based on punctuation priority, maintaining sentence integrity
- 🔄 Smart retry mechanism: Automatically saves temporary files on failure, quick recovery on retry
- 🎨 User-friendly interface: Clean and intuitive graphical user interface with drag-and-drop support
- 📊 Real-time progress feedback: Clear progress display and status prompts

## 💻 Installation & Usage

### Quick Start

1. Go to the Releases page (https://github.com/cylind/scribe2srt/releases) and download the latest version
2. Extract and run the program directly
3. It is recommended to install FFmpeg: For video file processing, improving compatibility and efficiency

Run from source

#### Installation Steps

1. Download the project: `git clone https://github.com/your-username/scribe2srt.git`, then `cd scribe2srt`
2. Install dependencies: `pip install -r requirements.txt`
3. Run the program: `python app.py`

## 📖 Usage Instructions

### Basic Workflow

1. Select input file
   - Click the “Select File” button or drag and drop a file into the program window
   - Supports three input types:
     - Audio files: All common audio formats (MP3, WAV, FLAC, M4A, AAC, OGG, etc.)
     - Video files: All common video formats (MP4, MOV, MKV, AVI, FLV, WEBM, etc.)
     - JSON transcript files: ElevenLabs format transcription data
2. Configure processing options
   - Language selection: Select the source language or use “Auto Detect”
   - Audio event marking: Choose whether to mark non-speech events (e.g., laughter, applause)
3. Start processing
   - Click the “Generate Subtitles” button to begin transcription
   - The program will display detailed processing progress
4. Get results
   - After processing, the SRT subtitle file is automatically saved in the same directory as the source file
   - The program shows the output file path

### Subtitle Quality Standards

This tool follows professional subtitle production standards:

- Duration control: Minimum 0.83 seconds, maximum 7.0 seconds
- Character density: CJK languages up to 11 characters per second, Latin languages up to 15 characters per second
- Line length limit: CJK languages max 25 characters per line, Latin languages max 42 characters per line
- Semantic integrity: Prioritizes maintaining sentence completeness, segmentation based on punctuation priority

## ⚙️ Advanced Settings

### Subtitle Parameter Adjustment

Via the “Subtitle Settings” menu, you can adjust:

- Subtitle display duration and gap
- Character density limits
- Characters per line limit

### Large File Processing

- Automatic segmentation for long files (90+ minutes)
- Supports concurrent processing for faster speed
- Smart retry mechanism ensures processing success

## 🔧 Technical Highlights

### Smart Segmentation Algorithm

- Two-stage processing: Sentence pre-segmentation + intelligent merging
- Punctuation priority: Segmentation strategy based on linguistic rules
- Semantic integrity: Avoids breaking sentence structure
- Multi-language optimization: Differentiated processing for different languages

## 📄 License

This project is licensed under the MIT license.

---

If this project helps you, please give us a ⭐ Star!
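The limits listed under Subtitle Quality Standards can be illustrated with a minimal Python sketch. It is not taken from the Scribe2SRT source: the names `Cue`, `check_cue`, `srt_timestamp`, and `format_cue` are hypothetical, and counting every character except line breaks toward density is an assumption about how the tool measures it. Only the numeric limits come from the README; the `HH:MM:SS,mmm` timestamp layout is the standard SRT convention.

```python
from dataclasses import dataclass

# Limits quoted from the README's "Subtitle Quality Standards" list.
MIN_DURATION = 0.83                         # seconds
MAX_DURATION = 7.0                          # seconds
MAX_CPS = {"cjk": 11, "latin": 15}          # characters per second
MAX_LINE_CHARS = {"cjk": 25, "latin": 42}   # characters per line


@dataclass
class Cue:
    """One subtitle entry (hypothetical structure, for illustration only)."""
    index: int
    start: float  # seconds
    end: float    # seconds
    text: str     # subtitle lines separated by "\n"


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def check_cue(cue: Cue, script: str = "latin") -> list[str]:
    """Return the list of limits a cue violates (empty if it passes)."""
    problems = []
    duration = cue.end - cue.start
    if duration < MIN_DURATION:
        problems.append("too short")
    if duration > MAX_DURATION:
        problems.append("too long")
    # Assumption: every character except line breaks counts toward density.
    chars = len(cue.text.replace("\n", ""))
    if duration > 0 and chars / duration > MAX_CPS[script]:
        problems.append("too dense")
    if any(len(line) > MAX_LINE_CHARS[script] for line in cue.text.split("\n")):
        problems.append("line too long")
    return problems


def format_cue(cue: Cue) -> str:
    """Render one cue as an SRT block: index, time range, text."""
    return (
        f"{cue.index}\n"
        f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}\n"
        f"{cue.text}\n"
    )
```

For example, `check_cue(Cue(1, 0.0, 0.5, "Hi"))` reports `["too short"]` because 0.5 seconds is below the 0.83-second minimum, while a 2-second cue reading "Hello, world." passes every check. The values adjustable in the “Subtitle Settings” menu would correspond to the constants at the top of the sketch.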
Similar Articles
@yhslgg: Bro, sharing another open-source video translation tool—pyVideoTrans, with 17,700 stars on GitHub, a must-have for video repurposing and localization! In a nutshell: drop a video in, and it automatically runs through the entire pipeline of speech recognition → subtitle translation → AI dubbing → video synthesis, outputting a complete video in another language. Core...
pyVideoTrans is an open-source video translation tool that supports automatic speech recognition, subtitle translation, AI dubbing, and video synthesis. It integrates multiple ASR, translation, and TTS engines, making it suitable for cross-language video production and localization.
@wsl8297: Want to turn ebooks or documents into audiobooks? Many tools sound too robotic or lack subtitle sync, leaving you frustrated. Then I found the open-source project Abogen: it supports ePub, PDF, plain text, etc., one-click conversion to high-quality audio with auto-generated synchronized subtitles. It uses Kokoro voice at its core…
Abogen is an open-source tool that can convert documents like ePub and PDF into high-quality audio with one click, automatically generating synchronized subtitles. It supports a voice mixer and multiple deployment methods.
@yhslgg: Old Yang shares another gem open-source tool—KrillinAI, 10,000 stars on GitHub, a must-see for multilingual audio/video content! In a nutshell: from video download to subtitle translation, AI dubbing, video compositing, the entire pipeline is covered, and it can even auto-generate platform covers, supporting Bilibili, Douyin, Xiaohongshu, YouTube…
KrillinAI is an open-source tool that integrates the entire workflow of video downloading, subtitle translation, AI dubbing, and video compositing. It supports context-aware translation, voice cloning, auto layout, and cover generation, and is compatible with multiple AI models, suitable for multilingual audio/video content creation and distribution.
@noahduck283: A tool that can download any YouTube video, cleanly remove vocals, transcribe, translate into 100+ languages, clone the original voice, and perform fully automatic dubbing. It takes less than 2 minutes. Runs 100% locally. Free. Stitches six top open-source models into a single web page for "one-click download, vocal removal, transcription, translation, dubbing"...
Voice-Pro is a web tool that integrates six top open-source models (Whisper, Demucs, CosyVoice, F5-TTS, etc.), supporting YouTube video downloading, vocal removal, transcription, translation, voice cloning, and fully automatic dubbing. It takes less than 2 minutes, runs 100% locally, and is free.
@VincentLogic: Found an incredible open-source desktop AI tool from ByteDance! UI-TARS Desktop, with 31k stars, truly lives up to the hype. It can actually understand your screen and automate computer operations for you. Just tell it "Enable auto-save in VS Code and set the delay to 500ms", and it will automatically: -…
ByteDance's open-source desktop AI automation tool, UI-TARS Desktop, supports local execution and screen visual understanding. It can autonomously control your computer to handle daily tasks through natural language commands.