Tag
A tweet announcing an open-source UI/UX for building organization-level agent harnesses, allowing users to bring their own model and runtime and integrate with their tools.
TileRT is a tile-based runtime for ultra-low-latency LLM inference, with recent milestones including 1000+ tokens/s on a 1-trillion-parameter model. It supports models such as DeepSeek-V3.2 and GLM-5 and is available as open source on GitHub.
The article explores the gap in operational tooling for AI agents in production, focusing on challenges like error handling, state replay, security, and approval workflows.
Kuma is a compiler/runtime that compiles exported PyTorch models into self-contained WebGPU executables, enabling direct browser inference without Python or server dependencies.
Deno 2.9 introduces `deno desktop` for building native desktop applications using web technologies, along with improved Node.js compatibility, CSS module imports, and faster startup.
Discusses using NVFP4 4-bit floating-point weights for maximum performance, produced by in-house quantization from FP8 with NVIDIA ModelOpt, and highlights the format's dual scale factors, which give it a high dynamic range.
The author built a runtime control layer to address the problem of AI agents failing silently in production environments.
The author describes building a Minsky brain: a runtime of 40+ LLM agents wired in a connectome and staged phylogenetically to simulate a brain. They ask for advice on where to post this project on Reddit.
The article announces the launch of WASI 0.3, which integrates async primitives natively into WebAssembly components via the Component Model, simplifying APIs and enabling better component composition.
A deep dive into what happens before the main function in Rust binaries, exploring runtime initialization, entry points, and novel techniques for mutable data initialization.
This paper presents a five-plane reference architecture for runtime governance of production AI agents, addressing security risks from delegated actions. It defines primitives, invariants, and an evaluation framework to ensure safety and utility.
The project details a line-by-line translation of the OCaml runtime from C to Rust, aiming to improve safety and performance.
Introduces Spice, an open-source decision layer that acts as a 'brain' above execution agents like Claude Code and Codex, enabling context-aware task delegation and structured decision-making.
The author argues that AI agents in production should be defined as declarative manifests with their own runtime, rather than being scattered across application code, in order to enable proper versioning, observability, and rollback. They present their own solution as an open-source tool.
A reflective article critiquing Deno's shift towards Node.js compatibility, arguing it dilutes its original streamlined, zero-config philosophy that made it compelling for developers.
The author built Tidebase, an open-source runtime for agent workflows that provides checkpointing, retries, and live run state tracking using Postgres, enabling failed runs to resume from where they left off.
A detailed technical post about building AgentForge, an open-source agent harness in Python, covering components like session runtime, tool contracts, approval layers, and persistence, emphasizing that agents are defined by their runtime, not just the model.
Discusses common runtime issues in agentic workflows (loop budgets, tool permissions, state loss due to compression), recommends DenisSergeevitch's agents-best-practices resource, provides a provider-neutral reference, and emphasizes making permissions, budgets, and observability explicit mechanisms.
Agent libOS introduces a library-OS-inspired runtime substrate for LLM agents, treating agents as schedulable processes with explicit capabilities, lifecycle management, audit records, and human approval queues. The design shifts the trust boundary from tool dispatch to runtime primitives, enabling long-running agents to be scheduled, authorized, resumed, and audited safely.
MARGIN is a runtime confidence calibration method for multi-agent foundation model systems that learns per-agent calibration factors online, improving pairwise resolution from below random to 70-89% on hard benchmarks, requiring no held-out data or retraining.