PyRank

Data Augmentation Python Packages

Python packages with the GitHub topic data-augmentation. Sorted by relevance, with stars and monthly downloads.
webdataset
webdataset

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

2.1M 3K 232
asteroid-team
torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

1.8M 1K 100
akiomik
pilgram

A Python library for Instagram filters

259K 131 17
iver56
audiomentations

A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.

194K 2K 219
TorchIO-project
torchio

Medical imaging processing for AI applications.

96K 2K 262
snorkel-team
snorkel

A system for quickly generating training data with weak supervision

68K 6K 854
QData
textattack

TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/

28K 3K 446
NVIDIA
nvidia-dali-cuda120

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

28K 6K 665
bethgelab
imagecorruptions

Python package to corrupt arbitrary images.

25K 472 73
iver56
numpy-audio-limiter

Simple audio limiter. Made for use in audiomentations.

23K 8 0
iver56
fast-mp3-augment

Fast Python library for MP3 audio data augmentation (encode + decode for intentional audio quality degradation). Made for use in audiomentations.

22K 6 0
albumentations-team
albumentationsx

Next-generation Albumentations: dual-licensed for open-source and commercial use

18K 325 28
webdataset
wids

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

13K 3K 232
sparkfish
augraphy

Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes

8K 536 61
zoj613
polyagamma

An efficient and flexible sampler of the Pólya-Gamma distribution with a NumPy/SciPy compatible interface.

8K 27 5
visualdatabase
fastdup

fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.

7K 2K 87
gvtulder
elasticdeform

Differentiable elastic deformations for N-dimensional images (Python, SciPy, NumPy, TensorFlow, PyTorch).

4K 195 26
NVIDIA
nvidia-dali-cuda110

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

4K 6K 665
DeepTrackAI
deeptrack

DeepTrack2 is a modular Python library for generating, manipulating, and analyzing image data pipelines for machine learning and experimental imaging.

4K 238 62
NVIDIA
nvidia-dali-nightly-cuda120

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

3K 6K 665
vkit-x
vkit-nightly

Boosting Document Intelligence

3K 23 1
NVIDIA
nvidia-dali-nightly-cuda110

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

3K 6K 665
justinsalamon
scaper

A library for soundscape synthesis and augmentation

3K 423 69
NVIDIA
nvidia-dali-cuda130

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

2K 6K 665
Data from PyPI, GitHub, ClickHouse, and BigQuery