Articles from X
Natolambert announces a new lecture covering synthetic data and the history of distillation, from Hinton 2015 to modern on-policy distillation, with over 7 hours of video content.
This post explains how to create an automated feedback loop for AI agents to iteratively improve their skills, using computer use and an observer skill to evaluate and update the skill code.
PP-OCRv6 is a lightweight OCR model (34.5M parameters) that challenges large VLMs with its MetaFormer architecture, offering efficient text detection and recognition across multiple deployment scenarios.
Microsoft's NextLat introduces a training objective that rewards belief-state representations instead of relying solely on next-token prediction, pushing models toward compact world models for better generalization.
A developer demonstrates running the Gemma 4 26B MoE model locally on an 8GB RTX 4060 with the Hermes agent to fully automate backtesting of trading strategies, highlighting the growing capability of local LLMs as autonomous agents.
A developer shares surprising lessons from fine-tuning a small open model: base models often already max out on the intended improvements, the real weakness is behavior (caving), and fine-tuning requires careful measurement and balancing.
A discussion of how loops in AI agents can amplify both good and bad behaviors, emphasizing the need for an engaged human in the loop to guide the agent's learning of user preferences.
Crabbox is a new tool that gives AI coding agents isolated cloud environments to test and verify PRs, enabling them to work in parallel without conflicts and reducing the review bottleneck.
Tencent launches Agently Mail for QQ Mail, a dedicated email service tailored for AI Agents.
Chad Jones announces he will be on leave from Stanford starting June 30 to join the Anthropic Institute, where he will continue research on AI and economic futures.
The author built a fully offline AI agent using local embedding models, Llama via Ollama, and VectorAI DB to address the risks of cloud-dependent AI. The agent runs on an 8GB MacBook, processes sensitive documents, and maintains memory across sessions.
Cross Repo Review is a tool that maps inter-repo dependencies and surfaces downstream impacts, breaking changes, and blast radius on PRs, tracking code, service, data, and pipeline dependencies.
QodoAI released Cross Repo Review, an AI code review tool that can detect bugs across multiple repositories, going beyond single-repo analysis to catch issues caused by changes in one repo that break others.
Introducing the Perspective MCP, a tool that enables Claude to build conversion funnels with tracking, CRM, and automatic optimization, built on experience from $1B in ad spend.
AI has automated the coding process, shifting the software developer's role from writing code to specifying and verifying systems, effectively returning the focus to product development.
This post shows how to serve Baidu's Unlimited-OCR model as a temporary, OpenAI-compatible endpoint on Hugging Face Jobs, enabling multi-page document parsing with features like table-to-HTML and equation-to-LaTeX extraction.
A beginner-friendly tutorial on how to set up persistent memory for an AI Agent in 30 minutes, using the open-source EverOS tool to store memory as editable Markdown files, without requiring Docker or vector database clusters.
A thread explaining the 5 core mental shifts needed to transition from traditional software engineering to agent engineering, emphasizing why conventional patterns like hard-coded routes and binary tests fail with AI agents.
A live demonstration of an AI agent training a coding agent from a single prompt, with all artifacts recapped.
A novel parser architecture developed by Shanshrew achieves 2-3x speedup over the current fastest JS/TS parser, and is being integrated into Oxc.