Tag
This paper introduces MMSkills, a framework for representing, generating, and using multimodal procedural knowledge for visual agents. It combines textual procedures with visual state cards and keyframes, and demonstrates improvements on GUI and game-based visual agent benchmarks.
The author discusses the frustration of reading Markdown in the terminal and describes using Claude to quickly build a custom macOS Markdown viewer (MDV.app), illustrating how AI enables rapid creation of personal software tools.
Derpy Turtle is a Windows GUI tool designed to enhance Kokoro voice outputs by integrating voice search, RVC model training, and post-generation voice conversion into a unified workflow.
Inflorescence is a cross-platform native GUI for the Pijul version control system, built with Rust and the iced framework, inspired by Magit and designed for keyboard-driven navigation with async responsiveness.
H2O LLM Studio is an open-source framework and no-code GUI that simplifies the fine-tuning of large language models, supporting techniques like LoRA, DPO, and integration with Hugging Face.
Tencent collaborated on evaluations that improved OpenClaw’s harness performance and is helping to upstream fixes to the open-source repo.
A developer is excited to add a GUI to their self-built llm-wiki project.
T3 Code is a minimal web GUI for coding agents that supports Codex and Claude and is available as a desktop app and CLI tool across multiple platforms.