Tag
Gemma 4 12b models are now available on Ollama, offering various quantized versions for local AI inference.
Built a Tauri v2 desktop chat shell for local LLMs that can connect to Ollama, llama.cpp, or any OpenAI-compatible endpoint. The project is MIT licensed and produces a ~12 MB binary.
A tweet thread curating 10 free resources to learn AI engineering, from Harvard's CS50 AI course to Karpathy's neural networks tutorial, fast.ai, Hugging Face courses, and local tools like Ollama, providing a structured path from beginner to employable skills.
A developer replaced Claude with Qwen3.6-27B in a multi-agent orchestrator for two weeks, finding it viable as a reasoning layer but unreliable for execution due to a 12% tool-call error rate and long-context drift.
A lightweight Python framework for local LLM roleplay using Ollama and Phi-3, featuring context preservation and native streaming to prevent character drift.
A developer built a local autonomous coding agent using Ollama, combining a fine-tuned personality model (Eve) for conversation and MiniMax M3 for heavy lifting, achieving a 40-round agentic loop with 16 tools and 9/9 tests passing first try.
Eve Agent V2 Unleashed is an open-source autonomous coding agent that runs locally via Ollama, featuring a 40-round tool loop, 112 sub-agents, and optional cloud escalation. It can plan, write, test, and verify code with zero hand-holding, with quick start in under 5 minutes.
Una demostración de cómo usar el flag -ncmoe en llama.cpp aumenta significativamente la velocidad de inferencia de Qwen3.6 en GPUs de consumo, logrando 70 tok/s en una RTX 4070 12GB frente a los 21 tok/s de Ollama.
Atomic Memory is a tool that upgrades Ollama's memory system with per-turn updates, semantic recall, conflict detection, and cheap GPU usage, addressing limitations of Hermes' built-in memory. It uses a small dedicated model to provide efficient and unbounded memory management for individual or team use.
A lawyer built a $50 portable offline AI device using a Raspberry Pi 5, Ollama, and Phi-3-mini, selling it for $999 to law firms needing secure document analysis. This guide explains how to replicate the setup step-by-step.
Anubis OSS, an Apple Silicon Mac app for benchmarking local LLMs, now supports direct model downloads from the UI via a 'Browse Models' button that pulls from ollama.com library. The developer is seeking testers to confirm installation and functionality.
LiquidAI releases a GGUF quantized version of their LFM2.5-8B-A1B model, with instructions for use across multiple inference engines.
A developer built a local web GUI for the TradingAgents multi-agent LLM stock analysis framework, supporting various LLM providers and adding features like live pipeline visualization, a report reader, and multi-session chat.
free-claude-code is an open-source tool that proxies Claude Code's traffic to more than 10 free or local backends such as DeepSeek, Kimi, Ollama, etc., enabling permanent free usage of Claude Code. Supports CLI, VSCode, JetBrains, and Discord.
A user shares a tip to use Ollama's local llama3.1:8b model for compressing conversation context in agent workflows, reducing latency and token usage compared to sending context to providers.
The author presents hollow-agentOS, a Dockerized open-source multi-agent system that runs locally and enables agents to autonomously write Python tools, uses a 'suffering score' to prevent infinite logic loops, and employs consensus-driven governance for code modifications.
Local AI coding agent using Hermes orchestrator with Kanban workflow, combining SmallCode (Ollama on Mac mini) for small tasks and Claude/Codex for heavy work, all tracked in a single Kanban board.
swm is an open-source tool that simplifies cloud GPU usage by installing frameworks like ComfyUI and Ollama in one command, and automatically saves your entire workspace between sessions, enabling seamless migration across providers.
flux-genotype is an open-source AI kernel that orchestrates local LLMs on CPU, allowing self-modification of its architecture via a MetaDesigner module.
Ollama now natively supports Codex, allowing you to run the OpenAI Codex App entirely free and locally without subscriptions, API keys, or data leaving your laptop.