llm-evals

#llm-evals

@alphabatcher: 540,000 lines of Rails is a brutal way to learn the agent era. Garry's List shipped with: > 262,000 lines of app code > …

X AI KOLs Following ↗ · 3d ago Cached

A critique of large Rails codebases in the AI agent era, proposing a shift to skill-based development with agents, markdown skills, and TypeScript for deterministic I/O.


#llm-evals

Better Experiments with LLM Evals — A funnel, not a fork (6 minute read)

TLDR AI ↗ · 2026-05-21 Cached

Spotify Engineering discusses using LLM evals as a funnel before A/B experiments, improving hit rates and creating a feedback loop between evals and experiments.

