PyRank

Model Compression Python Packages

Python packages with the GitHub topic model-compression. Sorted by relevance, with stars and monthly downloads.
tensorflow
tensorflow-model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.

110K 2K 347
horseee
deepcache

[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free

54K 967 52
Microsoft
nni

An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.

31K 14K 2K
VainF
torch-pruning

[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.

24K 3K 382
tensorflow
tf-model-optimization-nightly

A suite of tools that users, both novice and advanced, can use to optimize machine learning models for deployment and execution.

10K 2K 347
lpalbou
model-quantizer

Effortlessly quantize, benchmark, and publish Hugging Face models with cross-platform support for CPU/GPU. Reduce model size by 75% while maintaining performance.

3K 2 0
kxytechnologies
kxy

A toolkit to boost the productivity of machine learning engineers.

3K 51 12
Picovoice
picollm

On-device LLM Inference Powered by X-Bit Quantization

1K 312 25
FasterAI-Labs
fasterai

FasterAI: Prune and Distill your models with FastAI and PyTorch

994 261 19
Picovoice
picollmdemo

On-device LLM Inference Powered by X-Bit Quantization

916 312 25
666DZY666
micronet

micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa; Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), low-bit (≤2b) ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op-adapt (upsample), dynamic_shape.

890 2K 474
r-papso
torch-optim

PyTorch models optimization by neural network pruning

867 3 1
aquvitae
aquvitae

Knowledge Distillation Toolkit

865 88 10
Microsoft
nni-daily

Neural Network Intelligence package

660 14K 2K
GeoffreyWang1117
uni-layer

A Universal Framework for Layer Contribution Analysis

604 0 0
gershonc
octopus-ml

A collection of handy ML and data visualization and validation tools. Go ahead and train, evaluate and validate your ML models and data with minimal effort.

521 23 5
SforAiDL
kd-lib

A PyTorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.

521 650 61
Argonaut790
fused-turboquant

Fused Triton kernels for TurboQuant KV cache compression — 2-4 bit quantization with RHT rotation. Drop-in HuggingFace & vLLM integration. Up to 4.9x KV cache compression for Llama, Qwen, Mistral, and more.

449 8 1
tianyic
only-train-once

Only Train Once (OTO): Automatic One-Shot General DNN Training and Compression Framework

448 311 47
musco-ai
musco-pytorch

MUSCO: MUlti-Stage COmpression of neural networks

414 72 17
danhicks96
prismkv

3-D stacked-plane KV cache quantization + RAG framework. Defensive prior-art publication extending TurboQuant to conditional 3-D polar cells with adaptive bit allocation.

360 1 1
microsoft
archai

Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.

313 486 93
m-pektas
bfas

Brute Force Architecture Search

220 4 0
mlzxy
qsparse

Train neural networks with joint quantization and pruning on both weights and activations using any PyTorch modules.

213 42 2
    • Data from PyPI, GitHub, ClickHouse, and BigQuery