triton-kernels

#triton-kernels

@raphaelsrty: Computing max similarity (scoring step of colbert, colpali) on gpus can be optimized and this is what @tonywu_71 did. I…

X AI KOLs Following · 2d ago

Tony Wu released late-interaction-kernels (LIK), a set of fused Triton kernels for MaxSim, the scoring step behind ColBERT and ColPali. The kernels are integrated into PyLate and colpali-engine and are reported to reduce memory use and improve performance.
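For readers unfamiliar with MaxSim, the scoring rule can be sketched in plain Python: each query token takes its best dot-product match among the document tokens, and those maxima are summed. This is a reference sketch of the standard late-interaction definition, not the LIK implementation; the function name `maxsim` and the toy vectors are illustrative assumptions.

```python
def maxsim(query, doc):
    """Late-interaction score between two lists of token embeddings.

    query, doc: lists of equal-length float vectors (hypothetical toy inputs).
    """
    score = 0.0
    for q in query:
        # Best dot-product match for this query token across all doc tokens.
        best = max(sum(qi * di for qi, di in zip(q, d)) for d in doc)
        score += best
    return score


# Toy example: 2 query tokens, 3 document tokens, embedding dimension 2.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]
score = maxsim(query, doc)
```

The naive version computes every query-token by document-token similarity before reducing. A fused GPU kernel would typically combine the similarity and max steps to avoid storing that intermediate matrix, though that is a general observation about fused kernels rather than a description of how LIK works.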

Wall Attention (GitHub Repo)

TLDR AI · 2026-06-03

Wall Attention is a new attention variant with per-channel, per-timestep multiplicative decay, providing content-dependent forgetting rates and efficient training/decode kernels implemented in Triton.

@akshay_pachaar: PyTorch Autograd vs. Unsloth Triton Kernels. The core engineering behind UnslothAI has always been impressive! Instead …

X AI KOLs Following · 2026-04-20

Technical explanation comparing PyTorch's default autograd with UnslothAI's custom backpropagation kernels written in OpenAI's Triton language for faster LLM fine-tuning.
