Articles from Blog
Technical commentary from Luke Curley discussing how WebRTC's design prioritizes low latency by aggressively dropping audio packets, which conflicts with LLM voice applications where prompt accuracy matters more than speed. He recounts challenges faced at Discord implementing retransmission within browser constraints.
Simon Willison discusses the effectiveness of using HTML instead of Markdown as an AI output format, highlighting benefits like SVG diagrams, interactive widgets, and rich explanations. The post includes examples from Thariq Shihipar on Anthropic's Claude Code team and practical prompts for GPT-5.5.
Andrew Ng argues that fears of an AI-driven jobpocalypse are overblown, citing strong hiring in software engineering and historical patterns of technology creating more jobs than it destroys.
CyberSecQwen-4B is a small, specialized 4B parameter model fine-tuned for defensive cybersecurity tasks, designed to run locally on a single GPU, addressing privacy, cost, and air-gapped deployment needs.
Allen AI releases EMO, a mixture-of-experts model where modular structure emerges naturally from data, enabling use of just 12.5% of experts for a task while maintaining near full-model performance.
Google launches The Small Brief, enlisting four ad industry icons to create studio-quality campaigns for small businesses using its AI creative studio Flow, demonstrating AI's storytelling power.
OpenAI details how it deploys Codex with safety controls including sandboxing, approval policies, network policies, and agent-native telemetry to ensure secure operation of coding agents in enterprise environments.
A tutorial and project demonstrating LoRA fine-tuning of Qwen3-1.7B on AMD MI300X using ROCm for clinical question answering, providing a CUDA-free alternative for medical AI development.
Perplexity released its Personal Computer feature for Mac users through the desktop app, granting AI agents access to local files, applications, connectors, and the web.
Google DeepMind has taken a minority stake in EVE Online's developer (now Fenris Creations) to use the game as a testbed for AI models, studying intelligence in complex, dynamic systems without affecting live players.
TLDR is hiring a Senior Software Engineer for its Applied AI team, offering $250k-$350k and fully remote work, focusing on making processes legible to code and composable into workflows.
The author reflects on a visit to China's AI labs, comparing cultural differences between Chinese and American labs in building LLMs. Chinese labs benefit from a culture of collective work and student involvement, while American labs face challenges from individual ego and career ambitions.
This article argues that AI intelligence is becoming commoditized, similar to compute and storage, and that the most valuable companies will not be model builders but those who own customer relationships, proprietary data, and workflows.
Ramp presents a case study on using reinforcement learning post-training to build Fast Ask, a specialized spreadsheet retrieval agent that improves accuracy and reduces latency compared to general-purpose models.
Meta's In-Kernel Broadcast Optimization (IKBO) eliminates redundant user-embedding broadcast in RecSys inference via kernel-model-system co-design, delivering up to 2/3 latency reduction and ~4x speedup on H100 GPUs, and serving as the backbone for the Meta Adaptive Ranking Model.
The article discusses the importance of quality control for reinforcement learning data, outlining the shortcomings of current data vendors and the evaluation criteria used by frontier AI labs for RL data.
Codex CLI v0.128.0 introduces /goal, a feature for persistent goals that survive terminal restarts and multi-hour pauses, so a run can continue automatically without re-prompting. The author recounts a six-hour session that persisted through a five-hour laptop closure, demonstrating the feature's reliability.
GitHub improved token efficiency in its agentic workflows by logging token usage via an API proxy and building daily optimization workflows, reducing overhead from unused MCP tool registrations.
Meta is preparing its Hatch AI agent, a consumer-grade autonomous agent with social media integration, expected to roll out behind a waitlist. The agent will handle image/video generation, shopping, research, and scheduled tasks, leveraging Instagram and Facebook.
llm-gemini 0.31 is a new release of the plugin for using Google's Gemini models with the LLM command-line tool.