Ollama Model Tester (GitHub Repo)
Summary
A small, dependency-free Python CLI tool that runs the same prompt against your local Ollama models and saves every response to disk, making it easy to compare models side by side.
Cached at: 06/05/26, 02:06 PM
ulyssestenn/omt
Source: https://github.com/ulyssestenn/omt
Ollama Model Tester
A small, dependency-free CLI for running the same prompt against your local Ollama models and saving every response to disk — so you can compare models (or compare repeated runs of one model) side by side.
It uses only the Python standard library: no pip install required.
Requirements
- Python 3.7 or newer
- Ollama running locally (the default http://localhost:11434)
- At least one model pulled, e.g. ollama pull llama3.1:8b
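To confirm both requirements before running the tool, a short standard-library check along these lines may help. It is a sketch, not code from the repository: it queries Ollama's /api/tags endpoint, which lists locally installed models, and the helper names `model_names` and `list_local_models` are made up for illustration.

```python
import json
import urllib.request

# Default local address from the requirements above.
OLLAMA_URL = "http://localhost:11434"


def model_names(tags_payload):
    """Extract sorted model names from a decoded /api/tags response body."""
    return sorted(m["name"] for m in tags_payload.get("models", []))


def list_local_models(base_url=OLLAMA_URL, timeout=5):
    """Return installed model names; raises URLError if Ollama is unreachable."""
    with urllib.request.urlopen(base_url + "/api/tags", timeout=timeout) as resp:
        return model_names(json.load(resp))
```

Calling `list_local_models()` with Ollama stopped raises `urllib.error.URLError`, and an empty list means no model has been pulled yet.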
Quick start
Make sure Ollama is running, then:
python3 ollama_model_test.py
You’ll be asked, in order:
- Which model to use (pick a number from your installed models)
- The prompt: type as many lines as you like, then put /done on its own line to finish
- How many times to run the prompt
- Temperature (0.0–2.0), or press Enter to use Ollama’s default
- Whether to stream the responses live to the terminal
It then runs the prompt the requested number of times and writes the results
under ollama-runs/.
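The interactive steps above could be handled roughly as follows. This is a minimal sketch under stated assumptions, not the repository's implementation: the function names are hypothetical, and the request body follows the shape accepted by Ollama's POST /api/generate endpoint, where the temperature goes under `options`.

```python
def read_prompt(lines):
    """Collect lines until one that is exactly '/done' (ignoring surrounding whitespace)."""
    collected = []
    for line in lines:
        if line.strip() == "/done":
            break
        collected.append(line.rstrip("\n"))
    return "\n".join(collected)


def parse_temperature(text):
    """Return None for an empty answer (keep Ollama's default), else a float in [0.0, 2.0]."""
    text = text.strip()
    if not text:
        return None
    value = float(text)
    if not 0.0 <= value <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    return value


def build_payload(model, prompt, temperature=None, stream=False):
    """Build a request body for Ollama's POST /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": stream}
    if temperature is not None:
        # Omitting "options" leaves the temperature at Ollama's default.
        payload["options"] = {"temperature": temperature}
    return payload
```

In an interactive session, `read_prompt(sys.stdin)` would consume typed lines up to the /done marker. Sending the same payload once per requested run yields one response per run.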
Command-line flags (optional)
Every prompt above can be supplied up front, which makes the tool scriptable. Anything you omit is still asked interactively.
| Flag | Description |
|---|---|
| --model NAME | Local model to use (must already be installed) |
| --runs N | Number of generations to run |
| --temperature T | Temperature, 0.0–2.0 |
| --prompt-file PATH | Read the prompt from a UTF-8 text file |
| --stream / --no-stream | Stream responses live, or don’t |
Example — run a saved prompt three times, fully non-interactive:
python3 ollama_model_test.py \
--model llama3.1:8b \
--prompt-file prompt.txt \
--runs 3 \
--temperature 0.7 \
--no-stream
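For readers curious how flags like these can coexist with interactive fallbacks, here is one possible argparse layout. It is a sketch rather than the tool's actual parser: `build_parser` is a hypothetical name, and the paired --stream / --no-stream flags use a shared destination instead of `argparse.BooleanOptionalAction`, since that action needs Python 3.9 and the README targets 3.7.

```python
import argparse


def build_parser():
    """Parser where every option defaults to None, meaning 'ask interactively'."""
    parser = argparse.ArgumentParser(prog="ollama_model_test.py")
    parser.add_argument("--model", metavar="NAME")
    parser.add_argument("--runs", type=int, metavar="N")
    parser.add_argument("--temperature", type=float, metavar="T")
    parser.add_argument("--prompt-file", metavar="PATH")
    # Two flags write to one destination; passing both is rejected by argparse.
    stream = parser.add_mutually_exclusive_group()
    stream.add_argument("--stream", dest="stream", action="store_true", default=None)
    stream.add_argument("--no-stream", dest="stream", action="store_false", default=None)
    return parser
```

Any attribute left as `None` after parsing signals that the corresponding question still needs to be asked.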
Output
Results are grouped into one folder per prompt:
ollama-runs/
what-are-the-main-tradeoffs-between_835562a4/
prompt.md # the prompt, with its hash and timestamp
metadata.json # every run against this prompt (model, timing, options)
llama3.1-8b.md # responses + Ollama metadata for this model
gemma3-1b.md
The folder name is the first few words of the prompt plus a short hash of the full prompt. Because the folder is keyed on the prompt, running the same prompt against a different model drops its output into the same folder — making model-to-model comparison easy. Each model’s file records every run’s response alongside Ollama’s run metadata (token counts, timings, and so on).
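A naming scheme of this kind can be sketched as below. The README does not state the hash algorithm, the word count, or how model names are sanitized, so SHA-256, six words, eight hex characters, and the character replacement rule are all assumptions inferred from the example listing; both function names are hypothetical.

```python
import hashlib
import re


def prompt_folder_name(prompt, max_words=6, hash_len=8):
    """First few words of the prompt plus a short hash of the full prompt."""
    words = re.findall(r"[a-z0-9]+", prompt.lower())[:max_words]
    slug = "-".join(words) or "prompt"
    # Assumption: SHA-256 truncated to hash_len hex characters.
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:hash_len]
    return f"{slug}_{digest}"


def model_file_name(model):
    """Replace characters that are awkward in file names, e.g. ':' in 'llama3.1:8b'."""
    return re.sub(r"[^A-Za-z0-9._-]", "-", model) + ".md"
```

Because the hash covers the whole prompt, two prompts that share their opening words still land in different folders, while an identical prompt always maps back to the same one.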