Tag
A reflective article questioning the casual assumption that building AI agents is easy, highlighting the complex components involved, such as APIs, RAG, tool calling, memory, and orchestration, and suggesting that simpler workflows often suffice before true agents are needed.
The author questions whether many so-called AI agents are better described as workflows, arguing that for repeatable browser tasks, defined workflows may be more reliable than agents that reinterpret steps each time.
The author argues that the real danger of AI agents is not their errors but their ability to perform final actions autonomously, suggesting that agents should stop one step earlier and leave the final click to humans or narrow workflows.
IBM Research explores how agent logic—software primitives like knowledge graphs and program analysis—can guide LLM-based agents to efficiently handle complex enterprise workflows, reducing hallucinations and costs while improving outcomes.
A discussion about whether AI agents can reliably automate complex, multi-step workflows without constant human supervision, asking about current limitations and experiences.
Haydn Belfield discusses how tokenmaxxing experiments and token leaderboards serve an inspirational and exploratory purpose by testing AI model limits and discovering new workflows.
A discussion on where AI agents fail in real workflows, highlighting issues with coordination, reliability under messy inputs, and the challenge of reducing human intervention in production.
A detailed guide on using OpenAI's Codex as an operating system for knowledge work, including setup, workflows, and a seven-day starter plan, written using Codex itself.
OpenAI released a library of ready-to-use prompts for Codex, featuring project workflows and automations that can be adapted for other AI coding agents.
The article argues that while many are building and selling AI agents, the real value lies in the workflows and training that make them useful, not the underlying technology.
A personal take that AI agents feel genuinely futuristic because they can autonomously plan steps, use tools, and recover from errors, marking a shift from AI that only answers questions to AI that actually does things.
Claude Code is releasing a /workflows feature that turns standard business work into pseudo-deterministic workflows based on SOPs, which the author sees as a significant step for Enterprise AI.
Boris Cherny, creator of Claude Code, argues that single-agent workflows are obsolete and that the future belongs to teams of specialized agents working together.
The author describes Briefing Fox, a prompt engineering tool that now includes an Agentic AI & Workflow feature, allowing even first-time AI users to create custom AI agents in minutes. It is free and requires no signup.
Matt Pocock published the Claude Code skills he uses daily for real engineering work; the collection has over 90,000 stars and is released under the MIT license.
Anthropic quietly shipped /workflows in Claude Code, replacing LLM orchestrators with code-based control flow to avoid token tax and context sloppiness.
Boris Cherny, creator of Claude Code at Anthropic, explains on a podcast why some users save 10 hours a week while others save none, highlighting hidden features like CLAUDE.md, task chaining, batch operations, and automations that most users overlook.
Security research reveals Megalodon, a technique for mass backdooring of GitHub repositories by exploiting CI workflows.
A tweet announces an Agent Mode version of a report for Every subscribers, containing the full essay, sources, references, and workflows, designed to be used with Codex or Claude Code.
Pao announced the launch of Handinger, a managed cloud agent for automating business tasks such as email workflows, reporting, and data analysis.