acceleration

#acceleration

OTCache: Optimal Transport for Geometry-Aware Caching in Diffusion Models

arXiv cs.LG ↗ · 1h ago Cached

OTCache is a training-free framework that uses optimal transport to predict caching schedules for diffusion models, achieving up to 4.7x acceleration on FLUX.1, Qwen-Image, and HunyuanVideo while improving generation fidelity.

BlockPilot: Instance-Adaptive Policy Learning for Diffusion-based Speculative Decoding

Hugging Face Daily Papers ↗ · yesterday Cached

BlockPilot proposes an instance-adaptive policy that predicts the optimal block size for diffusion-based speculative decoding, achieving significant speedup with minimal overhead.

@Sumanth_077: Fine-tuning massive LLMs used to be painfully slow, but not anymore! 4 open source libraries that accelerate fine-tunin…

X AI KOLs Timeline ↗ · 2d ago Cached

A tweet highlighting four open-source libraries (Unsloth, LLaMA Factory, DeepSpeed, Axolotl) that accelerate fine-tuning of large language models with memory and speed optimizations.

ResilPhase: Plug-and-Play Phase Mapping and Noise-Resilient Macro-Trajectory Extrapolation for Diffusion Acceleration

arXiv cs.AI ↗ · 5d ago Cached

ResilPhase is a training-free acceleration framework for diffusion models that reformulates accelerated inference as stable macro-trajectory extrapolation in ODE space, using derivative-free barycentric Lagrange extrapolation and bounded phase mapping to achieve state-of-the-art fidelity under high acceleration ratios.

@songhan_mit: We develop an agent-native approach to accelerate genAI, continuing the success of KDA (Kernel Design Agent) at a highe…

X AI KOLs Following ↗ · 6d ago Cached

Enze Xie announces Sol Video Inference Engine, an agent-native, training-free full-stack accelerator for video diffusion that auto-tunes cache, sparse attention, token pruning, quantization, and kernel fusion, achieving >2× end-to-end speedup on large models like 64B Cosmos3-Super and 22B LTX-2.3.

@eladgil: Feels like the AI world is hitting a new era. Every 6 months is a big step going forward Vibe (written 20 years ago)-

X AI KOLs Timeline ↗ · 2026-06-20 Cached

Elad Gil reflects on the accelerating pace of AI progress, linking to a review of Charles Stross's sci-fi novel Accelerando, which explores singularity themes.

eCNNTO: A Highly Generalizable ConvNet for Accelerating Topology Optimization

arXiv cs.AI ↗ · 2026-06-20 Cached

This paper proposes eCNNTO, a CNN with residual connections to accelerate density-based topology optimization by predicting near-optimal densities from early iteration histories, achieving up to 97% reduction in iterations and strong generalization across different boundary conditions, geometries, and mesh resolutions.

AdaPLD: Adaptive Retrieval and Reuse for Efficient Model-Free Speculative Decoding

arXiv cs.CL ↗ · 2026-06-05 Cached

AdaPLD is a training-free method that improves model-free speculative decoding by using adaptive retrieval combining lexical and semantic similarity, and constructing branched reuse hypotheses to handle continuation uncertainty, achieving up to 3.10x decoding speedup.

TAPS: Target-Aware Prefix Tree Selection for Diffusion-Drafted Speculative Decoding

arXiv cs.AI ↗ · 2026-06-02 Cached

TAPS proposes a target-aware prefix tree selection method for diffusion-drafted speculative decoding, achieving up to 7.9x lossless end-to-end speedup by improving the acceptance-cost tradeoff over prior methods.

Timeline of AI models since GPT-2. Model releases are accelerating over time.

Reddit r/ArtificialInteligence ↗ · 2026-06-01

An article chronicling the timeline of AI model releases since GPT-2, highlighting the accelerating pace of model launches over time.

How truth will change faster than ever because we learn from what learns from us. 2030

Reddit r/ArtificialInteligence ↗ · 2026-06-01

This article argues that AI creates a fast feedback loop where humans and machines mutually shape truth, accelerating consensus shifts and making truth increasingly synthetic and detached from reality.

Speculative Pipeline Decoding: Higher-Accuracy and Zero-Bubble Speculation via Pipeline Parallelism

arXiv cs.CL ↗ · 2026-06-01 Cached

This paper proposes Speculative Pipeline Decoding (SPD), a framework that uses pipeline parallelism within a single LLM to enable parallel token speculation, avoiding the latency bubbles and accuracy degradation of multi-token prediction in traditional speculative decoding.

@gdb: AI for accelerating research, by expanding what mathematicians and scientists dare attempt:

X AI KOLs Following ↗ · 2026-05-30 Cached

Greg Brockman highlights how AI gives researchers like mathematician Terence Tao the freedom to explore bolder, more creative ideas in their work.

RT-Lynx: Putting the GEMM Sparsity In a Right Way for Diffusion Models

Hugging Face Daily Papers ↗ · 2026-05-26 Cached

RT-Lynx proposes using activation sparsity instead of weight sparsity to accelerate diffusion models, achieving up to 1.55× linear-layer speedup while maintaining generation quality. The paper has been accepted at ICML 2026.

Earth is now heating up twice as fast as in previous decades

Hacker News Top ↗ · 2026-05-21 Cached

Global warming has accelerated to twice the rate of previous decades, with 98% confidence that the acceleration is due to climate change. If warming continues at this pace, the 1.5°C Paris Agreement limit could be breached by 2028.

@sama: three of the things we are most excited about: 1. AGI accelerating research 2. AGI accelerating companies 3. personal A…

X AI KOLs ↗ · 2026-05-20 Cached

Sam Altman shares three areas of excitement for AGI: accelerating research, companies, and personal goals. He also notes recent announcements including a unit distance result and $2M in OpenAI credits for Y Combinator startups.

CATS: Cascaded Adaptive Tree Speculation for Memory-Limited LLM Inference Acceleration

arXiv cs.LG ↗ · 2026-05-13 Cached

This paper introduces CATS, a cascaded adaptive tree speculation framework designed to accelerate LLM inference on memory-constrained edge devices by optimizing memory usage while maintaining high token acceptance rates.

PARD-2: Target-Aligned Parallel Draft Model for Dual-Mode Speculative Decoding

arXiv cs.CL ↗ · 2026-05-12 Cached

This paper introduces PARD-2, a dual-mode speculative decoding framework that uses target-aligned parallel draft models to accelerate LLM inference, achieving up to 6.94x lossless acceleration on Llama 3.1-8B.

DARE: Diffusion Language Model Activation Reuse for Efficient Inference

arXiv cs.LG ↗ · 2026-05-12 Cached

This paper introduces DARE, a method for improving the inference efficiency of Diffusion Large Language Models by reusing cached key-value and output activations to reduce computational redundancy with negligible quality loss.

SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting

arXiv cs.CL ↗ · 2026-05-11 Cached

This paper introduces SpecBlock, a block-iterative speculative decoding method that combines path dependence with efficient drafting to accelerate LLM inference. It demonstrates improved speedup over existing methods like EAGLE-3 while maintaining lower drafting costs.
