@MaxForAI: You'd be hard-pressed to find a better eval resource library. If you're interested in eval, these are what you should read. Thanks to @xdotli for sharing.
Summary
Shares a curated library of AI evaluation (evals) resources, including high-quality blogs, podcasts, papers, and projects, compiled by Xiangyi Li.
Cached at: 06/24/26, 08:29 PM
It’s hard to find a better eval resource library than this one.
If you’re interested in eval, these are what you should read.
Thanks to @xdotli for sharing.
Xiangyi Li (@xdotli): sharing my personal library on evals 1/n
i put together the highest quality blogs, podcasts, papers, and projects on evals. additions are welcome!
The Unsloth team is terrifying.
They took GLM 5.2, a leading Chinese open-source model, compressed it with an aggressive technique called 1-bit quantization, and converted it into the lightweight GGUF format.
This means you can run GLM 5.2 entirely locally (on a 256GB Mac Studio) with no internet connection or external server needed, and at an impressive speed of 21 tokens per second (human speech is about 10-20 tokens per second).
And they didn’t stop there: they also livestreamed a showdown, pitting this compressed local model against two of the most powerful and expensive paid cloud models, Claude 4.8 Opus and GPT-5.5.
What shocked developers in the comments most was that this local model traded blows with models served from multi-billion-dollar server clusters, delivering precise answers that rivaled those of the closed-source giants.
The real winner today isn’t a specific model; it’s the concept of local inference:
From now on, your data never has to leave your device.
Your API bill is zero dollars, with intelligence on par with models from the best US companies.
What a crazy but revolutionary approach.
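For readers unfamiliar with the 1-bit compression mentioned above, here is a toy sketch of the basic idea: keep only the sign of each weight plus one shared scale. This is not Unsloth's method and not how GGUF stores data; production low-bit formats typically use per-block scales and keep some layers at higher precision. All function names below are hypothetical and exist only for illustration.

```python
from typing import List, Tuple


def quantize_1bit(weights: List[float]) -> Tuple[List[int], float]:
    """Reduce each weight to a sign bit, plus one shared scale.

    The scale is the mean absolute value of the weights (an assumption
    made for this sketch; other choices are possible).
    """
    if not weights:
        return [], 0.0
    scale = sum(abs(w) for w in weights) / len(weights)
    bits = [1 if w >= 0 else 0 for w in weights]
    return bits, scale


def dequantize_1bit(bits: List[int], scale: float) -> List[float]:
    """Rebuild approximate weights: bit 1 becomes +scale, bit 0 becomes -scale."""
    return [scale if b else -scale for b in bits]


def packed_size_bytes(n_weights: int) -> int:
    """Bytes needed to store n sign bits, ignoring the shared scale."""
    return (n_weights + 7) // 8


weights = [0.5, -1.5, 1.0, -1.0]
bits, scale = quantize_1bit(weights)
approx = dequantize_1bit(bits, scale)
print(bits, scale, approx)
```

The reconstruction keeps each weight's direction but discards its individual magnitude, which is why storage shrinks to roughly one bit per weight and why real schemes add finer-grained scales to recover accuracy.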
Similar Articles
@xdotli: sharing my personal library on evals 1/n i put together the highest quality blogs, podcasts, papers, and projects on ev…
A Twitter thread sharing a curated personal library of high-quality blogs, podcasts, papers, and projects on AI evaluations (evals), inviting additions.
@pauliusztin_: Every day, 100+ people ask me, "How can I learn AI evals?" I copy-paste these 11 links (every time): 1. AI evals & obse…
A curated list of 11 links shared daily to help people learn AI evaluation techniques, covering evals, observability, LLM-as-judge, and agent evaluation.
@10xmylife: 10 Design Reference Websites to Help Improve Your AI's Aesthetic
Lists 10 design reference websites that can help improve the aesthetic quality of AI-generated content.
@jinchenma_ai: The best compilation of high-quality AI sources on the web, save it now! You can throw this article to Codex + Obsidian, let AI compile an index directory, then later when you want AI to search for quality information, you can let it search according to this directory.
Recommends an article that compiles the best high-quality AI sources on the web, and suggests using Codex and Obsidian to build an index directory that AI can later search for quality information.
@PierceZhang34: Sharing an open collaborative repository focused on AI-assisted research: Awesome Vibe Research. The core goal is to collect and curate reusable, verifiable, and evolvable AI-assisted components across the full research workflow (from idea generation to paper publication and dissemination), including: Agents, Skills...
Shares Awesome Vibe Research, an open collaborative repository maintained by ModelScope. The repository collects and curates reusable, verifiable, and evolvable AI-assisted components across the full research workflow, including agents, skills, workflows, tools, and best practices. It aims to help researchers and developers use AI to improve research efficiency.