I made a Windows app for managing llama.cpp in WSL/Ubuntu

Reddit r/LocalLLaMA 05/26/26, 06:34 PM Tools

windows wsl llama-cpp desktop-app gpu-management model-download open-source

Summary

llama.cpp Console is a Windows desktop app that provides a GUI for managing llama.cpp in WSL/Ubuntu, handling installation, building, model downloading, and serving.

I’m a Windows user, and I have fairly Windows-y expectations for software: I prefer not having to live in a terminal just to install, build, configure, and run things. I couldn’t find an app that managed the full llama.cpp-on-WSL workflow the way I wanted, so I made one.

llama.cpp Console is an unofficial Windows desktop app for setting up and running llama.cpp models through Ubuntu/WSL. The Windows app itself is a self-contained WPF app, and it helps manage the WSL side from the UI.

**GitHub:** [https://github.com/alekk89/llama.cpp-Console](https://github.com/alekk89/llama.cpp-Console)

**What it can do from the UI:**

- Detect/install WSL and guide Ubuntu setup
- Install/update CPU build tools inside Ubuntu
- Install/update CUDA Toolkit support inside WSL
- Install/update Vulkan build dependencies
- Download llama.cpp source from the official repo or a custom repo
- Build CPU, CUDA, or Vulkan llama.cpp runtimes inside WSL
- Search Hugging Face for GGUF models
- Download/register models, including some compatibility hints and companion projector/mmproj handling
- Set launch parameters per model
- Choose which llama.cpp runtime/build each model should use
- Start, stop, and supervise llama-server
- Monitor live tokens, runtime metrics, logs, GPU status, utilization, and temperatures
- Track logs, jobs, downloads, and lifetime metrics
- Manage local OpenCode model/provider/agent config snippets from the app, so a configured model can be added to OpenCode quickly

The main reason I built it is that I wanted the boring setup work to feel more like normal Windows software - click through the UI, see what is installed, see what is missing, build the runtime, download a model, pick launch settings, and run it without losing full control of what's going on.

**A few notes:**

- This is a Windows-first app. The actual llama.cpp runtime runs in Ubuntu/WSL.
- Model serving defaults to local-only.
- Right now the app is centered around one active served model at a time.
- The first public release is unsigned, so Windows SmartScreen may warn. SHA-256 files are included with the release artifacts.
- This is not affiliated with or endorsed by llama.cpp or ggml-org.

I’ve been using a simpler version of this locally for a while, then polished it up enough to release in case it’s useful to other Windows users.

Planned future work includes faster model switching, keeping models warm in RAM where practical, and eventually supporting more than one loaded model at a time.

Please note that I do not own AMD GPUs, so the Vulkan installation/build path has not been validated on AMD hardware by me.
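
Since the release is unsigned and ships with SHA-256 files, checking a download against its checksum is a reasonable extra step. Here is a minimal sketch of that check in Python, not something taken from the project. It assumes the checksum file is either a bare hex digest or a `sha256sum`-style `<digest>  <filename>` line saved as UTF-8; the real layout of the release's SHA-256 files may differ. The function names and the file names in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large installers are not loaded into RAM at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_digest(checksum_text: str) -> str:
    """Extract the hex digest from the text of a checksum file.

    Assumption: the first whitespace-separated token is the digest, which
    covers both a bare digest and a "<digest>  <filename>" line.
    """
    tokens = checksum_text.split()
    if not tokens:
        raise ValueError("checksum file is empty")
    return tokens[0].lower()


def verify(artifact: Path, checksum_file: Path) -> bool:
    """Return True if the artifact's SHA-256 matches the digest in checksum_file."""
    expected = expected_digest(checksum_file.read_text(encoding="utf-8"))
    return sha256_of_file(artifact) == expected
```

With placeholder file names, `verify(Path("release.zip"), Path("release.zip.sha256"))` returns `True` only when the downloaded file matches the published digest. A mismatch means the download should not be run.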


