runtime-optimization

RAMPART: Registry-based Agentic Memory with Priority-Aware Runtime Transformation

arXiv cs.CL · 2026-06-04

RAMPART is a compile-time memory model and in-RAM block registry for LLM-based agents. It uses five composable primitives to manage context assembly with priority-aware ordering and eviction. Experiments across multiple models in the 7B to 14B parameter range show that block grouping, relevance gating, and schema eviction significantly improve task success rates and reduce prompt token costs.

@_akhaliq: GPU Forecasters: Language Models as Selective Surrogates for Kernel Runtime Optimization

X AI KOLs Following · 2026-06-02

This paper proposes using language models as selective surrogates to optimize GPU kernel runtime, demonstrating a novel approach to performance forecasting.

SkillSmith: Compiling Agent Skills into Boundary-Guided Runtime Interfaces

arXiv cs.AI · 2026-05-18

SkillSmith is a boundary-first compiler-runtime framework that extracts fine-grained operational boundaries from LLM agent skills, so that agents can dynamically access only the relevant components. On the SkillsBench benchmark, it reduces solve-stage token usage by 57.44% and thinking iterations by 42.99%.
