The author built TinyHarness, a low-memory-footprint AI harness compatible with Ollama, llama.cpp, and vLLM, aiming to compete with tools like pi and opencode.
A developer created a free, open-source AI assistant that floats on the macOS desktop and runs entirely locally using models like Gemma and Qwen via Ollama, with no API keys or subscriptions, ensuring data privacy and offline capability.
Critical security vulnerabilities in Ollama, including a memory leak exploit dubbed 'Bleeding Llama' and a Windows RCE flaw, have been disclosed, prompting urgent upgrades for users.
Oh My PPT is a locally running AI slideshow generation tool that automatically creates presentations from documents or topics and can operate offline via Ollama.
The article argues that, given the rapid pace of new model releases, vLLM has overtaken Ollama in usability, and finds it more practical than alternatives like DeepSpeed or TensorRT.
A guide on running local AI models like Qwen 3.5-9B on an M4 MacBook with 24GB RAM using tools like LM Studio, Ollama, and pi, including specific configuration tips for optimal performance.
A recommendation of the open-source project awesome-llm-apps, which catalogs 100+ AI agent and RAG applications; the latest merge adds a browser automation MCP agent based on local Ollama.
This article is a guide to local large model deployment, covering hardware selection, memory calculations, runtime tool comparisons, and model quantization options, taking users from getting started to optimizing their local inference experience.
The author shares a locally runnable AI companion built with Python, Gemini, and Ollama, featuring a custom cognitive architecture based on Global Workspace Theory and an Integrated Information Theory proxy for personality modeling.
A developer achieved a 9/10 pass rate on real Go tasks using a routed local setup built around Qwen3.6 35B and the little-coder scaffold, showing strong local performance when paired with the right tooling.
Developer Ivan Fioravanti demonstrates running Andrej Karpathy's autoresearch project locally with a 6-bit quantized Gemma-4-26B model on Apple Silicon, suggesting successful training of the Gemma 4 E2B IT variant.
A community discussion post seeking advice on which Mac Mini configuration (M4, M2 Pro, or M1 Max) to purchase for running local LLMs with Ollama and coding assistants, with the decision complicated by rumored M5 releases and current supply shortages.
A developer shares their experience building a local-first knowledge base using MCPs, Strapi, TanStack, and Ollama with Gemma 4, noting that switching to frontier models like Claude is easy.
A developer argues that llama.cpp deserves first-class support in OSS AI coding tools, criticizing the ecosystem's preference for Ollama and calling for more flexible, endpoint-agnostic integrations.