Tag
The article distinguishes between token minimization and context discipline in AI usage, highlighting that efficient prompt optimization is not the same as maintaining proper context awareness.
This blog post details three recent optimizations to LLVM's BumpPtrAllocator, reducing fast-path overhead by removing redundant alignment, null pointer checks, and per-allocation accounting, resulting in improved performance for Clang, lld, and other LLVM components.
This paper introduces a difference-of-convex programming framework in Wasserstein space for optimizing non-convex functionals over probability measures, with explicit decompositions for Maximum Mean Discrepancy and Energy Distance, and proves convergence of the lifted convex-concave procedure.
This paper introduces COOPA, a modular LLM agent architecture for operations research problems that combines iterative confidence-based modeling, element-level provenance, and multi-solver routing. Evaluated across eight LLM backbones and four baselines, COOPA achieves the best macro-average accuracy on six backbones and improves over the strongest baseline by up to 6.7 percentage points.
huff12 is a 12-stream Huffman decoder optimized for Apple Silicon processors, aiming to improve decoding performance through parallel stream processing.
A set of lecture notes covering the mathematics of neural networks, from basic activation functions to geometric concepts like group convolutions and equivariance.
The paper introduces RiVER, a reinforcement learning method that improves LLMs' coding performance on problems without known gold solutions by ranking programs on hidden test cases and providing graded feedback.
A blog post discussing optimization techniques for constrained categorical probability distributions, using softmax reparameterization and log barrier methods, applied to protein binder design.
A user details their setup running Qwen 27B with llama.cpp on an RTX PRO 6000 Blackwell for local coding agents, compares performance to Claude models, and asks for help resolving frequent crashes and malformed response issues.
This paper derives a scaling law for sketched linear contrastive learning under a Gaussian latent-variable model, analyzing how risk decomposes into approximation, optimization, and statistical terms, and provides theoretical guidance for balancing model size, data, and compute in contrastive learning.
This paper presents CASOP, a framework for context-aware synthesis and evaluation of optimization pipelines for warehouse order fulfillment, enabling automatic construction of valid algorithmic pipelines from a modular repository.
This paper provides optimal high-probability bounds for stochastic gradient descent under Markovian noise for smooth objectives satisfying the Polyak-Łojasiewicz (PL) condition, closing gaps between expectation and high-probability guarantees and extending to heavy-tailed settings with matching lower bounds.
This paper proposes an agentic aggregator framework for coordinating electric bus fleet operations, integrating optimization-based scheduling with supervisory AI agents to handle disturbances, tariff adaptation, and value allocation, revealing trade-offs between operational efficiency and profit-oriented pricing.
BunnyxStudio spent three weeks removing SwiftData, which significantly improved Hive's startup speed: a library of 66,000 images is now usable almost instantly.
The LFM2.5 230M model achieves 1,400 tokens per second in the browser using custom WebGPU kernels, demonstrating efficient local inference.
This article discusses how traditional primary key designs can isolate tables, and introduces structured primary keys as an alternative approach to improve SQL query performance and maintain relational integrity.
Describes a technique to improve AI agent speed by moving stable context out of the prompt, reducing token usage and latency.
The article discusses how LLM code style choices affect token consumption and costs, offering optimizations such as using Web API standards and simpler indentation to reduce output tokens.
This paper presents Agentic-LTPO, a nested bilevel optimization framework that uses agentic AI to adapt physical layer configurations under dynamic operator policies, achieving 57.2% long-term performance improvement in cell-free MIMO beamforming.
This article revisits techniques for creating extremely small ELF executables on Linux, exploring how to reduce size to 45 bytes by abusing header fields and overlapping structures while maintaining ELF specification conformance.