SI Units for Request Rate (2024)
Summary
This article discusses which SI units to use when measuring request rate in distributed systems. It proposes hertz (Hz) for periodic, regular traffic and becquerel (Bq) for stochastic, organic traffic, so that a reported request rate also communicates the traffic pattern. Both units are dimensionally identical (s⁻¹), so the distinction is one of convention rather than magnitude.
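As a rough illustration of the proposed convention, the sketch below labels a request rate as Hz or Bq based on how regular the inter-arrival gaps are. The function name `request_rate`, the use of the coefficient of variation as a regularity test, and the `cv_threshold` default of 0.1 are all assumptions made for this example and do not come from the article.

```python
from statistics import mean, pstdev


def request_rate(timestamps, cv_threshold=0.1):
    """Return (rate, unit) for sorted request timestamps given in seconds.

    Hypothetical helper: the regularity test and the threshold are
    illustrative assumptions, not part of the article's proposal.
    """
    # Gaps between consecutive requests, in seconds.
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if not gaps:
        raise ValueError("need at least two timestamps")

    avg_gap = mean(gaps)
    if avg_gap <= 0:
        raise ValueError("timestamps must be sorted and span a positive interval")

    # Mean rate in events per second (s^-1), the dimension of both Hz and Bq.
    rate = 1.0 / avg_gap

    # Coefficient of variation: near 0 for evenly spaced requests,
    # near 1 for Poisson-like arrivals.
    cv = pstdev(gaps) / avg_gap

    unit = "Hz" if cv < cv_threshold else "Bq"
    return rate, unit


if __name__ == "__main__":
    periodic = [0.0, 0.5, 1.0, 1.5, 2.0]   # evenly spaced, e.g. a health check
    organic = [0.0, 0.1, 0.9, 1.0, 2.0]    # irregular, e.g. user traffic
    for name, ts in (("periodic", periodic), ("organic", organic)):
        rate, unit = request_rate(ts)
        print(f"{name}: {rate:.2f} {unit}")
```

Both sample traces average about two requests per second; only the unit label differs, which is the point of the convention.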
Similar Articles
RateQuant: Optimal Mixed-Precision KV Cache Quantization via Rate-Distortion Theory
This paper introduces RateQuant, a method for optimal mixed-precision KV cache quantization that uses rate-distortion theory to address distortion model mismatch. It significantly reduces perplexity compared to existing methods like KIVI and QuaRot with minimal calibration overhead.
UniSVQ: 2-bit Unified Scalar-Vector Quantization
UniSVQ proposes a unified 2-bit quantization framework that bridges scalar and vector quantization by parameterizing codewords as an affine transform of integer lattices, achieving state-of-the-art performance among scalar methods and matching vector methods with higher throughput.
Qwen3.6-27B Quantization Benchmark
This article benchmarks various Qwen3.6-27B quantizations (Q8 to Q2) using KLD and Same Top P metrics, comparing providers like Unsloth and mradermacher, and offers recommendations for quality-size trade-offs.
Elucidating the SNR-t Bias of Diffusion Probabilistic Models
This paper identifies a Signal-to-Noise Ratio timestep (SNR-t) bias in diffusion probabilistic models during inference, where SNR-timestep alignment from training is disrupted at inference time. The authors propose a differential correction method that decomposes samples into frequency components and corrects each separately, improving generation quality across models like IDDPM, ADM, DDIM, EDM, and FLUX with minimal computational overhead.
We should get rid of average CPU utilization
The article explains why average CPU utilization is a misleading metric for latency-sensitive workloads, using queueing theory and a real-world production incident. It argues for more nuanced monitoring approaches.