Tag
Another open-source tool on GitHub, Shimmy, is a single 5MB file written in Rust that provides fast and stable local inference with a full OpenAI-compatible API, targeting Ollama's pain points. It starts in under 100ms and uses about 50MB of memory.