optimization

#optimization

I didn't know it was possible to compile llamacpp to run cuda + vulkan at the same time..

Reddit r/LocalLLaMA ↗ · 2026-06-16

The author discovered that compiling llama.cpp with both CUDA and Vulkan backends simultaneously is possible, yielding a ~10% improvement in tokens/sec for decoding. They plan to run further benchmarks to assess the benefits.

#optimization

Making ast.walk 220x Faster

Hacker News Top ↗ · 2026-06-16 Cached

The Reflex team made Python's ast.walk 220x faster for their AI code generation linter by removing generator overhead, inlining functions, and implementing a Rust binding.

#optimization

@umichkim: AI for Science is moving from “writing text” to “writing and testing scientific code.” A new Nature paper introduces ER…

X AI KOLs Timeline ↗ · 2026-06-16 Cached

A new Nature paper introduces ERA, an AI system that iteratively writes, runs, scores, and improves scientific code through tree search, moving AI for science from text generation to code testing.

#optimization

The time the x86 emulator team found code so bad that they fixed it during emulation

Lobsters Hottest ↗ · 2026-06-16 Cached

A story from a Windows x86 emulator team about encountering a program with a fully unrolled 64KB initialization loop (65,536 instructions) and adding a special optimization to replace it with a tight loop.

#optimization

Large Language Models as Optimizers: A Survey of Direct vs. Tool-Augmented Approaches and Their Performance Frontiers

arXiv cs.AI ↗ · 2026-06-16 Cached

This survey categorizes LLM-based optimization into three paradigms—direct, tool-augmented, and tool-creating—and reviews their performance frontiers and limitations.

#optimization

Spokes: Optimizing for Diverse Pretraining Data Selection

arXiv cs.CL ↗ · 2026-06-16 Cached

This paper introduces Spokes, a probabilistic diversification framework using the G-Vendi score to optimize diversity in pretraining data selection, achieving significant improvements in downstream task performance on FineWeb and DCLM by jointly optimizing quality and diversity.

#optimization

When to use what Schatten-$p$ norm in deep learning?

arXiv cs.LG ↗ · 2026-06-16 Cached

This paper provides guidance on the appropriate use of different Schatten-p norms in deep learning, analyzing their theoretical properties and practical implications for model regularization and optimization.

#optimization

Zero-order Parameter-free Optimization for LMO-based Methods: Novel Approach for Efficient Fine-tuning

arXiv cs.LG ↗ · 2026-06-16 Cached

This paper introduces AdaNAGED, a method that combines zero-order optimization, parameter-free adaptation, and non-Euclidean update geometry for memory-efficient fine-tuning of large language models, with theoretical convergence guarantees and validation on the OPT-1.3B model.

#optimization

α-Fair Insurance Pricing: A Fairness Continuum

arXiv cs.LG ↗ · 2026-06-16 Cached

This paper proposes an α-Fair Individual Solvent Premium (α-FISP) framework for insurance pricing that balances actuarial fairness and solidarity fairness while ensuring solvency, using constrained optimization to yield a continuum of pricing solutions.

#optimization

DFlash and Spec V2 Decoding (14 minute read)

TLDR AI ↗ · 2026-06-16 Cached

Z Lab, SGLang, and Modal release DFlash, a new speculative decoding model for Qwen 3.5 397B-A17B that uses block diffusion and KV injection to achieve over 4x throughput improvement over baseline and 1.5x over native MTP.

#optimization

@songhan_mit: Explore our continued efforts on KV cache compression:

X AI KOLs Following ↗ · 2026-06-15 Cached

A tweet from Song Han highlights continued work on KV cache compression, featuring a blog by Weian Mao that discusses system-level aspects often overlooked in papers.

#optimization

This is amazing. Token speed doubled + kv cache now need low vram - qwen 27b

Reddit r/LocalLLaMA ↗ · 2026-06-15

A new KV cache optimization called kvflash doubles generation speed and reduces VRAM usage for Qwen 3.6-27B on a single RTX 3090 while maintaining accuracy.

#optimization

Clojure is almost as fast as C (with some help)

Lobsters Hottest ↗ · 2026-06-15 Cached

This article details how Clojure, with the JVM's Vector API and careful optimization, achieved frame rates within 20% of C for a 3D stress test, demonstrating that a dynamic language can approach low-level performance on hot loops.

#optimization

A Deep Reinforcement Learning (DRL)-Based Transformer Method for Solving the Open Shop Scheduling Problem

arXiv cs.AI ↗ · 2026-06-15 Cached

Presents a Transformer-based scheduling policy trained with reinforcement learning for the open shop scheduling problem, showing that a model trained on small instances can generalize to much larger problems and compete with classical dispatching heuristics.

#optimization

FedSPC: Shared Parameter Correction for Personalized Federated Learning

arXiv cs.LG ↗ · 2026-06-15 Cached

Proposes FedSPC, a modular correction method for personalized federated learning that applies control-variate correction only to shared parameters, improving performance across various PFL methods on CIFAR-100 and Tiny-ImageNet.

#optimization

High-Frequency Pricing at Scale for E-Commerce

arXiv cs.LG ↗ · 2026-06-15 Cached

This paper presents a forecast-then-optimize algorithmic pricing tool for fashion e-commerce sales campaigns, using gradient-boosted trees for daily-demand forecasting and multi-objective optimization. A/B tests across 12 markets show the system achieves 6% higher profit while maintaining sales and revenue, and it has been deployed at Zalando.

#optimization

Simplifying Weak Reference Processing in ZGC

Lobsters Hottest ↗ · 2026-06-14 Cached

This master's thesis at Uppsala University, done in collaboration with Oracle, investigates reducing the overhead of weak reference processing in the ZGC garbage collector by proposing three pipeline modifications and an alternative annotated-field mechanism.

#optimization

@RitOnchain: Jane Street pays $750K/year for quants who master matrix calculations holistically that can be used to get alpha from s…

X AI KOLs Timeline ↗ · 2026-06-13

The tweet points to a free 57-minute resource from MIT's Applied Math team covering matrix calculations and automatic differentiation, pitched at quants and optimization work, and cites Jane Street's high compensation for such skills.

#optimization

@GitTrend0x: Must-have plugins for Hermes before takeoff: Orange Book Chinese Practical Guide, Optimization Guide Full Process Manual, Hermes HUD Visual Brain, Scarf Native macOS GUI, Open Design Local Design Skill Pack… Programmers across the internet have turned Hermes into the next-generation Agent …

X AI KOLs Timeline ↗ · 2026-06-12 Cached

Summarizes community plugins and resources for the Hermes Agent framework, including Chinese practical guides, optimization manuals, visual monitoring tools, a native macOS GUI, and design skill packs, aimed at users ranging from beginners to those doing advanced optimization.

#optimization

@RisingSayak: Published my first kernel to go the last mile to optimize LTX-2.3 from @Lightricks! torch.compile + cuDNN attn already …

X AI KOLs Following ↗ · 2026-06-12 Cached

The author published a custom kernel to further optimize LTX-2.3 from Lightricks, achieving a 1.52x speedup on GB10 on top of earlier torch.compile and cuDNN attention optimizations.
