Agyn is an open-source, Kubernetes-native agent runtime that brings AI agents like Claude Code and Codex into production with full credential isolation and pre-built harnesses. It addresses security concerns by running MCP servers in sidecars and using mTLS for internal services, so that a prompt injection cannot leak credentials.
This paper presents Inductive Deductive Synthesis (IDS), an LLM-based agentic system that jointly synthesizes implementation and formal proof for distributed systems. It succeeds on 7 of 7 specifications, roughly 200x faster than expert effort and 17% cheaper than state-of-the-art agents.
This article introduces Highest Random Weight (HRW) hashing as a stateless alternative to ExHashRing for consistent hashing in Elixir. It discusses the simplicity and performance trade-offs of HRW and provides code examples.
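For readers unfamiliar with HRW (also called rendezvous hashing), the idea can be sketched in a few lines. This is a minimal illustration in Python rather than the article's Elixir code; the function names `hrw_score` and `pick_node` and the choice of SHA-256 are assumptions made here, not taken from the article.

```python
import hashlib


def hrw_score(node: str, item: str) -> int:
    """Deterministic pseudo-random weight for a (node, item) pair.

    SHA-256 is an arbitrary choice for this sketch; any well-mixed
    hash would do.
    """
    digest = hashlib.sha256(f"{node}:{item}".encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer weight.
    return int.from_bytes(digest[:8], "big")


def pick_node(nodes: list[str], item: str) -> str:
    """Return the node with the highest weight for this item.

    No ring or other shared state is kept: the owner is recomputed
    from the node list on every call, at O(len(nodes)) cost.
    """
    if not nodes:
        raise ValueError("nodes must not be empty")
    return max(nodes, key=lambda node: hrw_score(node, item))
```

Calling `pick_node(["node-a", "node-b", "node-c"], "user:42")` returns one of the three names. If a node is removed from the list, only the items that node owned are reassigned, which is the property that makes HRW usable as a consistent hash. The trade-off is the linear scan over nodes on each lookup.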
White Rabbit is an open-source technology that provides sub-nanosecond synchronization and reliable data transfer for large distributed systems, connecting thousands of nodes over Ethernet.
This 1985 paper introduces the concept of virtual time for discrete event simulation and distributed concurrency control, proposing the Time Warp mechanism.
Google has open-sourced AX (Agent eXecutor), a distributed runtime framework for coordinating agentic loops with built-in recovery and resumption, designed for Kubernetes.
Ursula is an open-source, self-hosted distributed server for replayable, append-only event timelines that runs over HTTP and SSE, using a thread-per-core, multi-Raft architecture with S3 storage for low latency and durability.
Two skills for AI coding agents that design and run claim-driven tests for distributed and stateful systems, producing structured test plans and findings reports with 9-state verdicts and blame classification.
A developer shares learnings from building a 100K-line Rust-based multi-Paxos consensus engine using AI coding agents, achieving dramatic productivity gains and performance improvements.
Explains the Calvin protocol, which uses deterministic locking to achieve distributed ACID transactions without requiring two-phase commit (2PC), improving scalability and reducing contention compared to traditional approaches.
This paper proposes a semantic feature segmentation framework for predictive maintenance that decomposes monitoring signals into canonical and residual components to improve interpretability while maintaining predictive performance.
MinT is a managed infrastructure system that enables efficient training and serving of millions of LLMs by keeping base models resident and moving lightweight LoRA adapters, scaling across model architectures, storage, and policy management.
Thinking Machines Lab is hiring supercomputing engineers in NYC and SF to build infrastructure for real-time interactive models and large-scale training.
This article describes a global email-based book club for senior developers focused on reading technical books about databases, distributed systems, and software performance, currently featuring 'Operating Systems: Three Easy Pieces'.
A GitHub repository containing comprehensive system design interview notes based on Alex Xu's bestselling books, covering topics like scaling, consistent hashing, and distributed systems.
This paper explores collaborative intelligence paradigms where distributed Large Language Models work together across devices and clouds to handle resource constraints. It covers vertical device-cloud collaboration, horizontal multi-agent collaboration, routing policies, and open research challenges in scalable and trustworthy cooperative AI.
Modular published a blog post explaining why traditional HTTP routing doesn't work for LLM inference workloads. The article describes how their distributed inference framework handles stateful, heterogeneous GPU pods with KV caches, specialized prefill/decode backends, and conversation-level routing that traditional stateless routing algorithms cannot address.
Researchers from the Specula team created SysMoBench, a benchmark evaluating whether LLMs can faithfully model real-world computing systems in TLA+ or merely recite textbook specifications. The benchmark tests 11 systems across four phases and reveals systematic gaps in current LLMs' ability to accurately model system implementations versus reference papers.
The article discusses the complexities of implementing idempotency in APIs, arguing that handling edge cases like concurrent requests and content mismatches is harder than simple replay caching.
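The two edge cases named above, concurrent requests and content mismatches, can be illustrated with a small in-memory sketch. This is not the article's implementation; `IdempotencyStore`, `IdempotencyConflict`, and `execute` are hypothetical names, and a single process-wide lock stands in for whatever storage-level coordination a real service would need.

```python
import hashlib
import json
import threading


class IdempotencyConflict(Exception):
    """Raised when a key is reused with a different request body."""


class IdempotencyStore:
    def __init__(self):
        self._lock = threading.Lock()
        self._records = {}  # idempotency key -> (fingerprint, response)

    def execute(self, idem_key, payload, handler):
        # Fingerprint the request body so that reuse of the same key
        # with different content can be detected.
        body = json.dumps(payload, sort_keys=True).encode("utf-8")
        fingerprint = hashlib.sha256(body).hexdigest()

        # Simplification: the lock is held while the handler runs, so a
        # concurrent duplicate waits and then replays the stored response.
        # This also serializes unrelated keys, which a real service
        # would want to avoid.
        with self._lock:
            record = self._records.get(idem_key)
            if record is not None:
                stored_fingerprint, stored_response = record
                if stored_fingerprint != fingerprint:
                    raise IdempotencyConflict(idem_key)
                return stored_response
            response = handler(payload)
            self._records[idem_key] = (fingerprint, response)
            return response
```

A second `execute` call with the same key and payload returns the stored response without invoking the handler again, while the same key with a different payload raises `IdempotencyConflict`. The sketch leaves out handler failures, record expiry, and persistence.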
Martin Kleppmann discusses how the fundamentals of building large, distributed systems have evolved over the past decade in light of the updated second edition of his book "Designing Data-Intensive Applications."