Supertone released Supertonic 3, an open-source TTS model with 99M parameters that runs faster on CPU than a 2B model on an A100. It supports 31 languages and runs on ONNX Runtime for fully local inference.
TinySearch is an open-source MCP tool that performs a web search, crawls the result pages, and retrieves and re-ranks the relevant chunks, giving smaller local LLMs a concise context instead of polluting it with full page dumps.
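The chunk-then-re-rank idea in the TinySearch summary can be illustrated with a minimal sketch. This is not TinySearch's code: the function names (`chunk_text`, `score`, `top_chunks`), the word-window chunking, and the term-overlap scorer are all assumptions chosen for illustration, and a real tool would more likely use an embedding model or a dedicated re-ranker.

```python
import re


def tokenize(text):
    """Lowercase alphanumeric tokens; a stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_text(text, size=50, overlap=10):
    """Split text into overlapping windows of `size` words (hypothetical helper)."""
    words = text.split()
    step = max(size - overlap, 1)
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def score(query, chunk):
    """Fraction of distinct query terms found in the chunk (assumed scorer)."""
    q_terms = set(tokenize(query))
    if not q_terms:
        return 0.0
    return len(q_terms & set(tokenize(chunk))) / len(q_terms)


def top_chunks(query, pages, k=3, size=50, overlap=10):
    """Chunk every page, score each chunk against the query, keep the k best."""
    scored = []
    for page in pages:
        for chunk in chunk_text(page, size, overlap):
            scored.append((score(query, chunk), chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks with no overlap so irrelevant text never reaches the LLM.
    return [chunk for s, chunk in scored[:k] if s > 0]
```

Passing only the output of `top_chunks` to the model, rather than the crawled pages themselves, is what keeps the prompt short enough for a small local LLM.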
MiniMind-O has released an end-to-end omnimodal model with only 0.1B parameters, supporting text, speech, and image inputs as well as streaming speech output. The project open-sources the code, weights, training data, and technical report, and emphasizes that both training and inference run quickly on standard GPUs.
This paper introduces LiVeAction, a lightweight neural codec designed for real-time operation on resource-constrained devices. It uses an FFT-like structure and a variance-based rate penalty to achieve superior rate-distortion performance while remaining practical for low-power sensors.
dots.ocr is a new lightweight, multilingual vision-language model with 1.7B parameters. It achieves state-of-the-art performance on OmniDocBench, outperforming much larger models (72B+) on document parsing and OCR tasks.
MOSS-TTS-Nano is an open-source multilingual speech generation model with only 0.1B parameters, designed for real-time TTS that runs directly on CPU without a GPU. Released by the OpenMOSS team and MOSI.AI, it enables simple local deployment for web serving and product integration.