simd

#simd

PivCo-Huffman

Lobsters Hottest · yesterday

This paper presents PivCo-Huffman, a new approach to Huffman coding that uses pivot coding from wavelet trees to enable high-performance, SIMD-friendly encoding and decoding. The authors report that it consistently outperforms state-of-the-art Huffman codecs, and they show how ANS coding can be applied selectively to skewed nodes to approach ANS compression ratios while preserving high decompression speeds.


Accelerating std::copy_if using SIMD

Lobsters Hottest · 2026-05-26

Blog post analyzing and implementing a SIMD-accelerated version of std::copy_if using AVX-512 instructions on AMD Zen 4, with performance analysis and comparisons to compiler auto-vectorization.


What it takes to transpose a matrix

Hacker News Top · 2026-05-24

An in-depth technical blog post explaining how to efficiently transpose matrices using SIMD instructions on modern x86_64 CPUs, focusing on AVX2 intrinsics like _mm256_shuffle_epi8.


Bun.Image

Hacker News Top · 2026-05-23

Bun.Image is a zero-dependency chainable image pipeline for decoding, resizing, rotating, and re-encoding JPEG, PNG, WebP, HEIC, and AVIF, running off-thread and inspired by Sharp.


minc — A minimal language for building native software

Lobsters Hottest · 2026-05-22

minc is a minimal programming language that compiles directly to native executables for multiple platforms without external tooling. It features modern syntax, built-in SIMD, and a bundled shader compiler.


C++26 Shipped a SIMD Library Nobody Asked For

Lobsters Hottest · 2026-05-14

The article criticizes the new std::simd library in C++26, arguing it is slower than scalar loops, compiles slowly, and is outperformed by auto-vectorizers and alternative libraries like Google Highway, questioning its value after a decade-long standardization process.


@techwith_ram: A 10M document corpus eats 31 GB of RAM as float32 Most teams hit that wall & reach for a managed vector database. $400…

X AI KOLs Timeline · 2026-05-14

turbovec is an open-source Rust vector index using Google Research's TurboQuant algorithm, achieving 16x compression and faster search than FAISS, with integrations for RAG frameworks like LangChain, LlamaIndex, and Haystack.


Making cross-platform SIMD code pleasant

Lobsters Hottest · 2026-05-11

The author details the third iteration of the bx library's cross-platform SIMD abstraction, advocating for a typeless approach and SSA-style coding to simplify low-level performance optimization across different CPU architectures.


A cache-friendly IPv6 LPM with AVX-512 (linearized B+-tree, real BGP benchmarks)

Hacker News Top · 2026-04-20

planb-lpm is a portable, MIT-licensed C++17 library implementing efficient IPv6 longest-prefix-match (LPM) using a linearized B+-tree with AVX-512 SIMD, featuring dynamic FIB support, Python bindings, and comprehensive benchmarking against real BGP data.


The fastest way to match characters on ARM processors?

Lobsters Hottest · 2026-04-19

This article explores the fastest methods for matching characters on ARM processors using SIMD instructions, comparing traditional NEON approaches with the newer SVE2 capabilities available on modern ARM chips such as AWS Graviton4 and Google Axion.


https://www.youtube.com/watch?v=aE0onltJlOo

YouTube AI Channels · 2026-05-21

This lecture introduces GPU architecture as a flexible evolution of the SIMD (vector/array) processor. It covers data parallelism, memory banking, bank conflicts, and the history of SIMD instruction sets such as MMX, with emphasis on how GPUs exploit data parallelism and cope with serial bottlenecks.
