Tag
This article explores what Forward Deployed Engineering (FDE) means in AI deployment. It argues that FDE is not simply about making API calls or building agents, but a systematic engineering approach to production deployment that spans business translation, system design, platform integration, production operations, and capability accumulation.
A thread explaining six essential AI concepts (tokens, embeddings, vector search, etc.) for building production-ready AI systems, emphasizing that understanding them prevents costly failures like runaway API costs.
APEX proposes a three-layer self-evolution framework for production AI agents that simultaneously optimizes the harness, behavioural principles, and workflow topology. Experiments on a production agent show significant improvements in health score and workflow quality with minimal LLM calls.
The author discusses the need for a fourth governance loop in self-improving AI agent systems to prevent objective drift, proposing periodic human review, withheld benchmarks, and rotating evaluators as practical controls.
Discusses practical challenges in combining long-term memory and a knowledge base for personalized AI agents, questioning whether stuffing everything into context is standard practice and exploring alternatives such as separate retrieval pipelines or tool-based interaction.
Discusses the problem of memory drift in AI systems: preferences and facts go out of date, but new entries are only appended rather than replacing the old ones, which leaves conflicting versions and makes retrieval unreliable.
RAMP is a production-grounded evaluation framework for LLM agents that exposes significant capability degradation invisible to static benchmarks, showing task completion rates collapsing from 100% to 20% across serial workflows. The framework assesses 15 mainstream models on realistic compiler-construction workloads with complex toolchain interactions and staged recovery mechanisms.
The article argues that the difference between impressive and useless AI often lies not in the model itself but in the surrounding workflow: context, memory, tool access, and orchestration. It suggests that workflow architecture may become a more significant competitive advantage than raw model capability.
1Password shares lessons from using AI agents to analyze and refactor their large Go monolith, detailing successes in deterministic tooling and challenges in applying agents to live production changes.
The article shares decks from the Knowledge Graph Conference (KGC) 2026, highlighting a trend of enterprises deploying production-grade knowledge graphs for reasoning and governance rather than just vector retrieval.
A detailed breakdown of a 9-layer production AI architecture, covering the RAG pipeline, agents, prompts, security, evaluation, and observability layers.
The article discusses how AI agent workflows are shifting optimization focus from pure inference costs to broader challenges like latency, orchestration overhead, and reliability. It highlights a trend toward hybrid architectures and dynamic model routing to address these multi-step workflow complexities.
A developer seeks recommendations on advanced AI workflow orchestration tools and patterns, including LangChain, LangGraph, and AWS Step Functions, to build more robust and future-proof systems.
Analysis of a recurring failure pattern in production AI systems where technically correct decisions become contextually wrong as underlying assumptions shift, framed as the 'Formalisation Trap' where meaning gets locked into outdated structures.