cupy/cupy


Summary

CuPy is a GPU-accelerated library that serves as a drop-in replacement for NumPy/SciPy, enabling efficient array operations on NVIDIA CUDA and AMD ROCm platforms.

NumPy & SciPy for GPU

Cached at: 06/28/26, 11:18 AM

Source: https://github.com/cupy/cupy

CuPy: NumPy & SciPy for GPU

CuPy is a NumPy/SciPy-compatible array library for GPU-accelerated computing with Python. CuPy acts as a drop-in replacement to run existing NumPy/SciPy code on NVIDIA CUDA or AMD ROCm platforms.

>>> import cupy as cp
>>> x = cp.arange(6).reshape(2, 3).astype('f')
>>> x
array([[ 0.,  1.,  2.],
       [ 3.,  4.,  5.]], dtype=float32)
>>> x.sum(axis=1)
array([  3.,  12.], dtype=float32)
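
Because the session above uses only NumPy-style calls, the same code can be written once against a module alias and run on either backend. The following is a minimal sketch of that pattern, not an official recipe: the alias `xp`, the helpers `row_normalize` and `to_host`, and the fallback to NumPy when no GPU is usable are all assumptions made for illustration.

```python
import numpy as np

try:
    import cupy as cp
    # Importing can succeed on a machine with no usable GPU, so probe for one.
    if cp.cuda.runtime.getDeviceCount() < 1:
        raise RuntimeError("no GPU device found")
    xp = cp
except Exception:
    # Assumption for this sketch: fall back to NumPy when CuPy is unusable.
    xp = np


def row_normalize(a):
    """Scale each row to unit L2 norm, using whichever module `xp` is."""
    norms = xp.linalg.norm(a, axis=1, keepdims=True)
    return a / norms


def to_host(a):
    """Return a NumPy array; copies from the GPU only for CuPy arrays."""
    return a.get() if hasattr(a, "get") else np.asarray(a)


x = xp.arange(6, dtype=xp.float32).reshape(2, 3)
row_sums = to_host(x.sum(axis=1))          # same reduction as in the session above
unit_rows = to_host(row_normalize(x + 1))  # rows [1, 2, 3] and [4, 5, 6], rescaled
print(row_sums, unit_rows)
```

Keeping the host transfer in one helper makes the device-to-host copy explicit, which is usually the first thing to look at when GPU code turns out slower than expected.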

CuPy also provides access to low-level CUDA features. You can pass ndarrays to existing CUDA C/C++ programs via RawKernels, use Streams for performance, or even call CUDA Runtime APIs directly.
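
As a rough illustration of the three low-level features just mentioned, the sketch below compiles a small CUDA C kernel with `cupy.RawKernel`, launches it inside a `cupy.cuda.Stream`, and queries the device count through `cupy.cuda.runtime`. The kernel name `vec_add`, the wrapper function, and the `HAVE_GPU` flag are hypothetical names chosen for this example, and it assumes non-empty 1-D float32 inputs on an NVIDIA GPU.

```python
import numpy as np

try:
    import cupy as cp
    # Direct CUDA Runtime API call: number of visible devices.
    HAVE_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    cp = None
    HAVE_GPU = False

# CUDA C source for a hypothetical element-wise addition kernel.
VEC_ADD_SRC = r'''
extern "C" __global__
void vec_add(const float* a, const float* b, float* out, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) {          // guard threads past the end of the arrays
        out[i] = a[i] + b[i];
    }
}
'''


def vec_add(a_host, b_host, threads=128):
    """Add two non-empty 1-D float32 NumPy arrays on the GPU."""
    if not HAVE_GPU:
        raise RuntimeError("no usable CUDA device found")
    kernel = cp.RawKernel(VEC_ADD_SRC, "vec_add")
    n = a_host.size
    blocks = (n + threads - 1) // threads  # enough blocks to cover n elements
    with cp.cuda.Stream(non_blocking=True) as stream:
        a = cp.asarray(a_host, dtype=cp.float32)
        b = cp.asarray(b_host, dtype=cp.float32)
        out = cp.empty_like(a)
        # NumPy scalars are passed by value, so np.int32 matches `int n`.
        kernel((blocks,), (threads,), (a, b, out, np.int32(n)))
        stream.synchronize()  # wait for the kernel before reading `out`
    return cp.asnumpy(out)
```

On a machine with a GPU, `vec_add(np.arange(4, dtype=np.float32), np.ones(4, dtype=np.float32))` should match the NumPy sum of the same inputs; without one, the function raises `RuntimeError` instead of failing inside the driver.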

Installation

Pip

Binary packages (wheels) are available for Linux and Windows on PyPI. Choose the right package for your platform.

| Platform | Architecture | Command |
| --- | --- | --- |
| CUDA 12.x | x86_64 / aarch64 | pip install cupy-cuda12x |
| CUDA 13.x | x86_64 / aarch64 | pip install cupy-cuda13x |
| ROCm 7.0 (experimental) | x86_64 | pip install cupy-rocm-7-0 |

Note: To install pre-releases, append --pre -U -f https://pip.cupy.dev/pre (e.g., pip install cupy-cuda12x --pre -U -f https://pip.cupy.dev/pre).

Conda

Binary packages are also available for Linux and Windows on Conda-Forge.

| Platform | Architecture | Command |
| --- | --- | --- |
| CUDA | x86_64 / aarch64 / ppc64le | conda install -c conda-forge cupy |

If you need a slim installation (one that does not also install the CUDA dependencies), install cupy-core instead: conda install -c conda-forge cupy-core.

If you need to use a particular CUDA version (say 12.0), you can use the cuda-version metapackage to select the version, e.g. conda install -c conda-forge cupy cuda-version=12.0.

Note: If you encounter a problem with CuPy installed from conda-forge, please report it to cupy-feedstock, and we will help investigate whether it is a packaging issue in conda-forge’s recipe or a real issue in CuPy.

Docker

Use NVIDIA Container Toolkit to run CuPy container images.

$ docker run --gpus all -it cupy/cupy
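
Whichever installation route is used, a quick check from Python can confirm that the package imports and can see a GPU. The helper below is a sketch for that purpose; the function name `cupy_status` and its message wording are made up for this example, and only the `cupy.cuda.runtime` calls and `cupy.__version__` come from CuPy itself.

```python
import importlib.util


def cupy_status():
    """Return a one-line description of the local CuPy installation."""
    if importlib.util.find_spec("cupy") is None:
        return "CuPy is not installed"
    try:
        import cupy as cp
    except Exception as exc:
        return f"CuPy is installed but failed to import: {exc}"
    try:
        n_devices = cp.cuda.runtime.getDeviceCount()
        # Integer-encoded runtime version (on CUDA builds, 12040 means 12.4).
        runtime = cp.cuda.runtime.runtimeGetVersion()
    except Exception as exc:
        return f"CuPy {cp.__version__} imported, but no usable GPU: {exc}"
    return f"CuPy {cp.__version__}: {n_devices} device(s), runtime version {runtime}"


print(cupy_status())
```

For a fuller report, `cupy.show_config()` prints the detected library versions and device details.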

Resources

cuSignal has been part of CuPy since v13.0.0.

License

MIT License (see LICENSE file).

CuPy is designed based on NumPy’s API and SciPy’s API (see docs/source/license.rst file).

CuPy is developed and maintained by Preferred Networks and community contributors.

Reference

Ryosuke Okuta, Yuya Unno, Daisuke Nishino, Shohei Hido and Crissman Loomis. CuPy: A NumPy-Compatible Library for NVIDIA GPU Calculations. Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Thirty-first Annual Conference on Neural Information Processing Systems (NIPS), 2017.

@inproceedings{cupy_learningsys2017,
  author       = "Okuta, Ryosuke and Unno, Yuya and Nishino, Daisuke and Hido, Shohei and Loomis, Crissman",
  title        = "CuPy: A NumPy-Compatible Library for NVIDIA GPU Calculations",
  booktitle    = "Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Thirty-first Annual Conference on Neural Information Processing Systems (NIPS)",
  year         = "2017",
  url          = "http://learningsys.org/nips17/assets/papers/paper_16.pdf"
}
