vectorization

#vectorization

How we made WINDOW JOIN parallel and vectorized

Hacker News Top · 3d ago

QuestDB introduces a dedicated WINDOW JOIN operator that is parallelized and vectorized, achieving up to 25x speedup over alternative databases for time-series aggregations around event timestamps.

A Tiny Compiler for Data-Parallel Kernels

Hacker News Top · 4d ago

This blog post describes a tiny compiler, implemented in about 180 lines of Python, that demonstrates how to lower data-parallel kernels by converting for loops into vectorized loops with lanes and masks.

PivCo-Huffman “merge” operations

Lobsters Hottest · 2026-06-22

This blog post analyzes the PivCo-Huffman paper, which introduces 'merge' operations for parallel Huffman decoding, enabling efficient vectorized and GPU-friendly decoding without interleaving overhead.

Accelerating NeurASP with vectorization and caching

arXiv cs.AI · 2026-06-10

This paper accelerates the NeurASP neurosymbolic AI framework by implementing vectorization, batch processing, and caching, achieving speedups of multiple orders of magnitude on larger tasks.

Accelerating std::copy_if using SIMD

Lobsters Hottest · 2026-05-26

This blog post implements and analyzes a SIMD-accelerated version of std::copy_if using AVX-512 instructions on AMD Zen 4, with performance analysis and comparisons to compiler auto-vectorization.

The fastest way to match characters on ARM processors?

Lobsters Hottest · 2026-04-19

This article explores the fastest methods for matching characters on ARM processors using SIMD instructions, comparing traditional NEON approaches with newer SVE2 capabilities available on modern ARM chips like AWS Graviton4, Google Axion, and others.
