CUDA-oxide: Nvidia's official Rust to CUDA compiler
Summary
CUDA-oxide is an experimental Rust-to-CUDA compiler developed by NVIDIA that enables writing safe GPU kernels in idiomatic Rust, compiling directly to PTX without requiring domain-specific languages or foreign bindings.
View Cached Full Text
Cached at: 05/11/26, 06:56 PM
Similar Articles
cuda-oxide: cuda-oxide is an experimental Rust-to-CUDA compiler
cuda-oxide is an experimental Rust-to-CUDA compiler backend released by NVIDIA, enabling pure Rust GPU kernel development without foreign language bindings.
The cuda-oxide Book
cuda-oxide is an experimental Rust-to-CUDA compiler that allows developers to write safe, idiomatic Rust GPU kernels that compile directly to PTX.
@npashi: Finally able to talk about what I've been heads-down on for 6 months at @nvidia We just open-sourced cuda-oxide — an ex…
NVIDIA has open-sourced cuda-oxide, an experimental rustc backend that allows developers to write CUDA kernels directly in pure Rust without DSLs, FFI, or source-to-source translation.
@QingQ77: Pure Rust LLM inference engine with custom CUDA kernels for each hardware × model × quantization combination, achieving higher inference speed than vLLM and TensorRT-LLM. https://github.com/Avarok-Cybersecurity/a…
Atlas is a pure Rust LLM inference engine that delivers faster inference than vLLM and TensorRT-LLM by customizing CUDA kernels for each hardware × model × quantization combination.
A hackable compiler to generate efficient fused GPU kernels for AI models [P]
The author presents a custom, hackable ML compiler written in Python that lowers LLMs to optimized CUDA kernels through a multi-stage IR pipeline, achieving performance competitive with or superior to PyTorch on specific operations. The article details the compiler's optimization passes, lowering rules, and CLI usage for generating efficient fused GPU kernels.