PyRank

CUDA Python Packages

Python packages with the GitHub topic cuda. Sorted by relevance, with stars and monthly downloads.
sgl-project
sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

304.1M 28K 6K
numba
numba

NumPy aware dynamic Python compiler using LLVM

65.5M 11K 1K
NVIDIA
nvidia-nccl-cu12

Optimized primitives for collective multi-GPU communication

48.4M 5K 1K
OpenNMT
ctranslate2

Fast inference engine for Transformer models

8.4M 4K 487
catboost
catboost

A fast, scalable, high-performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression, and other machine learning tasks for Python, R, Java, and C++. Supports computation on CPU and GPU.

6.3M 9K 1K
vllm-project
vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

6.3M 81K 17K
flashinfer-ai
flashinfer-python

FlashInfer: Kernel Library for LLM Serving

5.2M 6K 977
NVIDIA
nvidia-cutlass-dsl

CUDA Templates and Python DSLs for High-Performance Linear Algebra

4.9M 10K 2K
NVIDIA
nvidia-cutlass-dsl-libs-base

CUDA Templates and Python DSLs for High-Performance Linear Algebra

4.2M 10K 2K
flashinfer-ai
flashinfer-cubin

FlashInfer: Kernel Library for LLM Serving

3.8M 6K 977
pytorch
torchao

PyTorch native quantization and sparsity for training and inference

3.7M 3K 505
NVIDIA
nvidia-cudnn-frontend

cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it

3.5M 723 153
cupy
cupy-cuda12x

NumPy & SciPy for GPU

2.9M 11K 1K
replicate
cog

Containers for machine learning

2.4M 9K 686
meta-pytorch
torchrec

PyTorch domain library for recommendation systems

2.3M 3K 644
isl-org
open3d

Open3D: A Modern Library for 3D Data Processing

1.5M 14K 3K
NVIDIA
warp-lang

A Python framework for GPU-accelerated simulation, robotics, and machine learning.

1.5M 7K 509
PennyLaneAI
pennylane-lightning

The Lightning plugin ecosystem provides fast quantum state-vector and tensor network simulators written in C++ for use with PennyLane.

748K 136 54
PASSIONLab
openequivariance

OpenEquivariance: a fast, open-source GPU JIT kernel generator for the Clebsch-Gordan Tensor Product.

629K 143 9
XuehaiPan
nvitop

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

444K 7K 234
cupy
cupy-cuda13x

NumPy & SciPy for GPU

375K 11K 1K
sgl-project
sglang-kernel

SGLang is a high-performance serving framework for large language models and multimodal models.

326K 28K 6K
sgl-project
sgl-kernel

SGLang is a high-performance serving framework for large language models and multimodal models.

302K 28K 6K
rapidsai
pylibcudf-cu12

cuDF - GPU DataFrame Library

198K 10K 1K
    • Data from PyPI, GitHub, ClickHouse, and BigQuery