@no_stp_on_snek: I built a pocket Charles Spurgeon. Ask the Prince of Preachers for counsel or hand him your own sermon draft and let hi…
Summary
A developer built a pocket Charles Spurgeon AI assistant that runs fully offline on a fine-tuned Gemma model. It can answer theological questions, prepare sermons, and grade sermon drafts in Spurgeon's voice.
View Cached Full Text
Cached at: 06/15/26, 09:07 PM
I built a pocket Charles Spurgeon.
Ask the Prince of Preachers for counsel or hand him your own sermon draft and let him grade it (he will gently tell you where the Cross went missing).
My first of two Build Small builds for the Build small hackathon.
Runs fully offline on a small, fine-tuned model. Code and how it works: https://github.com/TheTom/pastors-pocket-spurgeon…
@huggingface @Gradio @cohere #BuildSmall
TheTom/pastors-pocket-spurgeon
Source: https://github.com/TheTom/pastors-pocket-spurgeon
Pastor’s Pocket Spurgeon
A Victorian study companion that answers, prepares, and grades sermons in the voice and Reformed (Calvinist) theology of Charles Haddon Spurgeon, the “Prince of Preachers”, running fully offline on a single consumer GPU.
Built for the Build Small Hackathon (Backyard AI track).
| 🤖 Model | thetom-ai/Spurgeon-Gemma-4-12B-v1 (Q8_0 GGUF) |
| 🚀 Live demo | Space: build-small-hackathon/pastors-pocket-spurgeon |
| ⚡ Serving engine | TurboQuant llama.cpp fork (turbo4 KV compression) |
| 🎬 Demo video | see the Space |
| 📓 Field notes | how it was built and trained |
What it does
Three modes, all in Spurgeon’s voice and Reformed doctrine:
-
The Counsel — ask a pastoral or theological question, get a shepherd’s answer.
-
Sermon Prep — give a passage or topic, receive a Spurgeon-style outline.
-
Sermon Review — submit a sermon draft and Mr. Spurgeon grades it: a Summary, Strengths, Concerns, A Word of Exhortation, and a verdict on the Sword & Trowel scale:
Marks Tier 5 A Trumpet in Zion 4 Sound Timber, Well Hewn 3 A Lamp Half-Trimmed 2 A Skeleton Unclothed 1 A Cloud Without Rain
How it works
A lightweight Gradio app talks to any OpenAI-compatible endpoint (SPURGEON_ENDPOINT).
It uses BM25 retrieval over Spurgeon’s own sermons to surface real cited passages alongside
each answer, and falls back to canned answers if no model endpoint is configured (so the UI
always demos). The Sermon Review is built section-by-section with forced headers, and the
grade is derived deterministically from the model’s mark so the verdict format is always
exact.
Run it locally
1. Get the model (from Hugging Face):
huggingface-cli download thetom-ai/Spurgeon-Gemma-4-12B-v1 \
Spurgeon-Gemma-4-12B-v1-Q8_0.gguf --local-dir ./model
2. Serve it with TurboQuant KV compression. Build the TurboQuant llama.cpp fork, then:
# q8_0 keys + turbo4 values: TurboQuant compresses the V cache so long
# conversations fit on constrained hardware (up to ~2.5x more context vs fp16 KV).
# turbo4 requires flash attention (-fa on).
llama-server -m ./model/Spurgeon-Gemma-4-12B-v1-Q8_0.gguf \
-c 131072 -fa on -ctk q8_0 -ctv turbo4 --jinja \
--host 0.0.0.0 --port 8080 --alias spurgeon
(On stock llama.cpp without the fork, use -ctv q8_0 — you lose the extra compression but
everything else works.)
3. Run the app:
pip install -r requirements.txt
export SPURGEON_ENDPOINT=http://127.0.0.1:8080/v1
export SPURGEON_MODEL=spurgeon
python app.py # http://127.0.0.1:7860
Optional env: SPURGEON_TOKEN (bearer token if your endpoint is behind auth).
TurboQuant
TurboQuant is a llama.cpp fork that adds
compressed KV-cache types (turbo2/turbo3/turbo4). turbo4 stores the V cache at
~4.25 bits/value (≈3.8x smaller than fp16); paired with q8_0 keys it fits up to ~2.5x
longer conversations on the same GPU, so a 12B pastor runs comfortably offline on consumer
hardware. It requires flash attention and is validated for this model with q8_0 keys +
turbo4 values.
License
- Code in this repository: Apache 2.0.
- Model weights (
Spurgeon-Gemma-4-12B-v1): inherit the Gemma license from the base model. - Spurgeon’s sermons used for retrieval and fine-tuning are public domain.
Similar Articles
Built a local AI assistant because I always knew this day would come, yesterday just made it feel very real
A developer built Bantz, a fully local AI personal assistant running on Gemma 4b with a butler persona, integrating Gmail, Calendar, web search, system monitoring, and desktop control, emphasizing independence from cloud infrastructure.
@om_patel5: THIS GUY BUILT A FREE AI ASSISTANT THAT FLOATS ON YOUR MACOS DESKTOP AND RUNS COMPLETELY LOCALLY no API keys, no subscr…
A developer created a free, open-source AI assistant that floats on macOS desktop, runs entirely locally using models like Gemma and Qwen via Ollama, with no API keys or subscriptions, ensuring data privacy and offline capability.
@garrytan: https://x.com/garrytan/status/2053127519872614419
Garry Tan describes using a personal AI agent system, termed 'Book Mirror', to deeply integrate reading material with his life context via Meta-Meta-Prompting. He shares insights on building real AI systems as an operating system rather than just a chat interface.
@N3sOnline: Used @mattpocockuk /teach skill and gave it access to a chess engine and an API to fetch all of my recent games. It now…
A user demonstrates using a custom AI skill with a chess engine and game API to fetch recent games, analyze them, and generate personalized lessons with interactive visuals and puzzles.
Built a tool to save Claude responses (and ChatGPT, Gemini) into one searchable vault - sharing in case it's useful
Coffer is a browser extension that adds a save button to AI chatbot responses from Claude, ChatGPT, and Gemini, storing them locally in a searchable vault with formatted Markdown.