@jichiep: privacy-filter.cpp performance vs. the PyTorch implementation: approximately 1.6x to 18x faster
Summary
privacy-filter.cpp runs approximately 1.6x to 18x faster than the PyTorch implementation.
Cached at: 06/17/26, 01:43 AM
privacy-filter.cpp performance Vs the PyTorch implementation. Approx between 1.6x and 18x faster: https://t.co/U0I4npCQgc
Similar Articles
OpenAI Privacy Filter Model
OpenAI quietly released an Apache-2.0-licensed privacy-filter model on Hugging Face with open weights, aiming to help users run local privacy-preserving filters while retaining big-lab quality.
openai/privacy-filter
OpenAI releases Privacy Filter, a 1.5B parameter bidirectional token classification model for PII detection and masking, featuring an Apache 2.0 license and long-context support for high-throughput data sanitization.
Introducing OpenAI Privacy Filter
OpenAI releases Privacy Filter, an open-weight model designed to detect and redact personally identifiable information (PII) in text with high efficiency and context awareness.
Benchmark: ONNX Runtime vs HF Transformers vs GGUF for Parakeet TDT 0.6B on CPU-only hardware [D]
A benchmark comparing ONNX Runtime, HF Transformers, and GGUF for the Parakeet TDT 0.6B ASR model on CPU-only hardware shows ONNX Runtime achieves 37% faster inference than HF Transformers bfloat16, while GGUF prioritizes memory efficiency.
Anyone using Flash Attention 2 (ai-bond) on their V100's? How is the performance?
A user benchmarks a V100-compatible port of Flash Attention 2, reporting 3x-17x speedups and up to 94% memory reduction over default PyTorch attention.