TextGen, formerly text-generation-webui, has evolved into a native, no-install desktop application for Windows, Linux, and macOS, offering enhanced privacy, ik_llama.cpp support, and native tool-calling capabilities as an open-source alternative to LM Studio.
Hi all, I have been making a lot of updates to my project, and I wanted to share them here. TextGen (previously text-generation-webui, also known by my username oobabooga, or just ooba) has been in development since December 2022, before LLaMA and llama.cpp existed. In the last two months, the project has evolved from a web UI into a **no-install desktop app** for Windows, Linux, and macOS with a polished UI. I have created a very minimal and elegant Electron integration for that. (Did you know LM Studio is also a web UI running over Electron? Not sure many people know that.)

https://preview.redd.it/tk8oibhgjw0h1.png?width=1686&format=png&auto=webp&s=95c70f769766466885c8fdc6e7211525a371a920

It works like this:

1. You download a *portable build* from the [releases page](https://github.com/oobabooga/textgen/releases)
2. Unzip it
3. Double-click textgen
4. A window appears

There is no installation, and no files are ever created outside the extracted folder. It's fully self-contained: all your chat histories and settings are stored in a `user_data` folder shipped with the build. There are builds for CUDA, Vulkan, CPU-only, Mac (Apple Silicon and Intel), and ROCm.

Some differentiating features:

* **Full privacy.** Unlike LM Studio, it doesn't phone home on every launch with your OS, CPU architecture, app version, and inference backend choices. Zero outbound requests.
* **ik_llama.cpp builds** (LM Studio and Ollama only ship vanilla llama.cpp). ik_llama.cpp has new quant types like IQ4_KS and IQ5_KS with SOTA quantization accuracy.
* **Built-in web search** via the `ddgs` Python library, either through tool calling with the built-in `web_search` tool (works flawlessly with Qwen 3.6 and Gemma 4), or through an "Activate web search" checkbox that fetches search results as text attachments.
* **Tool-calling support** through three options: single-file .py tools (very easy to create your own custom functions), HTTP MCP servers, and stdio MCP servers. You can enable confirmations so that each tool call shows up with approve/reject buttons before it executes. I have written a guide [here](https://github.com/oobabooga/textgen/wiki/Tool-Calling-Tutorial).
* **Custom characters** for casual chats, in addition to regular instruction-following conversations:

  https://preview.redd.it/anlkyz6ijw0h1.png?width=1686&format=png&auto=webp&s=e8783773865c8c0721bd1474d583fd96604c3d38

* **OpenAI- and Anthropic-compatible API** with very strict spec compliance. **It works with Claude Code**: you can load a model and run `ANTHROPIC_BASE_URL=http://127.0.0.1:5000 claude` and it will work.
* **Accurate PDF text extraction** using the `PyMuPDF` Python library.
* **`trafilatura` for web page fetching**, which strips navigation and boilerplate from pages, saving a lot of tokens in agentic tool loops.
* **Chat templates rendered through Python's Jinja2 library**, which works for templates where llama.cpp's C++ reimplementation of Jinja sometimes crashes.

I write this as a passion project/hobby. It's free and open source (AGPLv3) as always: [https://github.com/oobabooga/textgen](https://github.com/oobabooga/textgen)
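For readers curious what a single-file .py tool could look like, here is a rough sketch of the idea. The exact schema textgen expects is described in the Tool-Calling Tutorial linked above; the names below (`TOOL_SPEC`, `get_current_time`) and the spec layout are my own illustrative assumptions, not the project's actual format:

```python
# Hypothetical single-file tool sketch -- the real file format is defined
# in the project's Tool-Calling Tutorial; names here are assumptions.
from datetime import datetime, timezone

# JSON-schema-style description the model sees when deciding to call the tool
TOOL_SPEC = {
    "name": "get_current_time",
    "description": "Return the current UTC time as an ISO 8601 string.",
    "parameters": {"type": "object", "properties": {}, "required": []},
}

def get_current_time() -> str:
    """Runs locally when the model emits a matching tool call."""
    return datetime.now(timezone.utc).isoformat()
```

With confirmations enabled, a call to a function like this would first surface approve/reject buttons in the UI before the result is fed back to the model.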
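Since the API follows the OpenAI spec strictly, any standard client should work against it. A minimal sketch using only the standard library, assuming a model is already loaded and the server is listening on port 5000 (the port shown in the Claude Code example) with the usual `/v1/chat/completions` route:

```python
# Sketch of calling the OpenAI-compatible endpoint with the stdlib only.
# Assumes the server is running locally on port 5000 with a model loaded.
import json
import urllib.request

payload = {
    "model": "local-model",  # many local servers ignore or default this field
    "messages": [{"role": "user", "content": "Say hello in five words."}],
    "max_tokens": 64,
}

req = urllib.request.Request(
    "http://127.0.0.1:5000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# With the server running, uncomment to send the request:
# resp = urllib.request.urlopen(req)
# print(json.load(resp)["choices"][0]["message"]["content"])
```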