Ollama Model Tester (GitHub Repo)


Summary

A small, dependency-free Python CLI tool that runs the same prompt against your local Ollama models and saves every response to disk, making it easy to compare models side by side.


Cached at: 06/05/26, 02:06 PM

ulyssestenn/omt

Source: https://github.com/ulyssestenn/omt

Ollama Model Tester

A small, dependency-free CLI for running the same prompt against your local Ollama models and saving every response to disk — so you can compare models (or compare repeated runs of one model) side by side.

It uses only the Python standard library: no pip install required.

Requirements

  • Python 3.7 or newer
  • Ollama running locally (the default http://localhost:11434)
  • At least one model pulled, e.g. ollama pull llama3.1:8b
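Before launching the tool, it can help to confirm that the server is reachable and that at least one model is pulled. The following is a minimal sketch, not the tool's own code: it queries Ollama's `/api/tags` endpoint at the default address with the standard library only. The function names `parse_model_names` and `list_local_models` are hypothetical.

```python
import json
import urllib.request

# Default address from the requirements above.
OLLAMA_URL = "http://localhost:11434"


def parse_model_names(payload):
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [m["name"] for m in payload.get("models", [])]


def list_local_models(base_url=OLLAMA_URL, timeout=5):
    """Ask a running Ollama server which models are pulled locally.

    Hypothetical helper for illustration; not part of ollama_model_test.py.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(json.load(resp))
```

Calling `list_local_models()` raises `urllib.error.URLError` if nothing is listening on the port, and returns an empty list if the server is up but no model has been pulled.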

Quick start

Make sure Ollama is running, then:

python3 ollama_model_test.py

You’ll be asked, in order:

  1. Which model to use (pick a number from your installed models)
  2. The prompt — type as many lines as you like, then put /done on its own line to finish
  3. How many times to run the prompt
  4. Temperature (0.0 to 2.0), or press Enter to use Ollama’s default
  5. Whether to stream the responses live to the terminal

It then runs the prompt the requested number of times and writes the results under ollama-runs/.
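The two less obvious inputs above, the /done-terminated prompt and the optional temperature, could be handled along the following lines. This is a sketch under assumptions, not the tool's actual implementation; `read_prompt` and `parse_temperature` are hypothetical names.

```python
def read_prompt(lines, terminator="/done"):
    """Collect lines until one consists solely of the terminator."""
    collected = []
    for line in lines:
        text = line.rstrip("\n")
        if text.strip() == terminator:
            break
        collected.append(text)
    return "\n".join(collected)


def parse_temperature(raw):
    """Return None for an empty answer (meaning: use Ollama's default),
    otherwise a float that must lie between 0.0 and 2.0."""
    raw = raw.strip()
    if not raw:
        return None
    value = float(raw)
    if not 0.0 <= value <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    return value
```

For interactive use, `read_prompt(sys.stdin)` would read from the terminal. Note that a line which merely contains /done, rather than consisting of it, stays part of the prompt.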

Command-line flags (optional)

Every prompt above can be supplied up front, which makes the tool scriptable. Anything you omit is still asked interactively.

Flag                     Description
--model NAME             Local model to use (must already be installed)
--runs N                 Number of generations to run
--temperature T          Temperature, 0.0 to 2.0
--prompt-file PATH       Read the prompt from a UTF-8 text file
--stream / --no-stream   Stream responses live, or don’t

Example — run a saved prompt three times, fully non-interactive:

python3 ollama_model_test.py \
  --model llama3.1:8b \
  --prompt-file prompt.txt \
  --runs 3 \
  --temperature 0.7 \
  --no-stream
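Because every question can be answered with a flag, the same prompt file can be run against several models from a small driver script. The sketch below uses only the flags listed above; `build_command` and `run_all` are hypothetical helpers, and the script path assumes you run from the repository directory.

```python
import subprocess
import sys


def build_command(model, prompt_file, runs, temperature, stream=False):
    """Build a fully non-interactive invocation of ollama_model_test.py.

    Every flag is supplied, so the tool has nothing left to ask.
    """
    return [
        sys.executable, "ollama_model_test.py",
        "--model", model,
        "--prompt-file", prompt_file,
        "--runs", str(runs),
        "--temperature", str(temperature),
        "--stream" if stream else "--no-stream",
    ]


def run_all(models, prompt_file, runs=3, temperature=0.7):
    """Run the same prompt against each model in turn, stopping on failure."""
    for model in models:
        subprocess.run(build_command(model, prompt_file, runs, temperature), check=True)
```

For example, `run_all(["llama3.1:8b", "gemma3:1b"], "prompt.txt")` would run each model three times at temperature 0.7, provided both models are already installed.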

Output

Results are grouped into one folder per prompt:

ollama-runs/
  what-are-the-main-tradeoffs-between_835562a4/
    prompt.md         # the prompt, with its hash and timestamp
    metadata.json     # every run against this prompt (model, timing, options)
    llama3.1-8b.md    # responses + Ollama metadata for this model
    gemma3-1b.md

The folder name is the first few words of the prompt plus a short hash of the full prompt. Because the folder is keyed on the prompt, running the same prompt against a different model drops its output into the same folder — making model-to-model comparison easy. Each model’s file records every run’s response alongside Ollama’s run metadata (token counts, timings, and so on).
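The naming scheme can be illustrated with a short sketch. The README does not say which hash function, how many words, or which sanitising rules the tool uses, so the choices here (SHA-256, six words, eight hex characters, colon replaced by a hyphen) are assumptions, and the output will not necessarily match the tool's real folder names.

```python
import hashlib
import re


def prompt_folder_name(prompt, max_words=6, hash_len=8):
    """Slug of the first few words plus a short hash of the full prompt.

    Assumed scheme for illustration only; the real tool may differ.
    """
    words = re.findall(r"[a-z0-9]+", prompt.lower())[:max_words]
    slug = "-".join(words)
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:hash_len]
    return f"{slug}_{digest}"


def model_file_name(model):
    """Map a model tag such as 'llama3.1:8b' to a file name (assumed rule)."""
    return model.replace(":", "-") + ".md"
```

The property that matters is determinism: the same prompt text always maps to the same folder, so a later run with another model lands next to the earlier results, while any change to the prompt changes the hash and starts a new folder.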
