open-source-llm

Tag

Cards List
#open-source-llm

@MaximeRivest: Tool calling in open source LLMs is wildly different from one model to another. I just wipped up: http://chattemplatepl…

X AI KOLs Following · 2026-06-03 Cached

A new web tool, Chat Template Playground, lets users visualize how different open-source LLMs render their chat templates, highlighting differences in prompting and tokenization.

0 favorites 0 likes
#open-source-llm

Split my agent into a cheap router model and a premium synthesis model, bill dropped about 75%

Reddit r/AI_Agents · 2026-05-19

A developer splits their AI agent's LLM calls into a cheap router model (GPT-OSS 120B) for tool-picking and a premium model (gpt-5.4) for synthesis, cutting costs by ~78% while maintaining output quality.

0 favorites 0 likes
#open-source-llm

Drastically improve prompt processing speed for --n-cpu-moe partially offloaded models

Reddit r/LocalLLaMA · 2026-05-12

The article shares a performance optimization trick for llama.cpp, showing that increasing the micro-batch size (`-ub`) combined with partial CPU offloading (`--n-cpu-moe`) can drastically improve prompt processing speed for large models like gpt-oss-120b on consumer GPUs.

0 favorites 0 likes
← Back to home

Submit Feedback