PyRank

Dataset Generator Python Packages

Python packages with the GitHub topic dataset-generator. Sorted by relevance, with stars and monthly downloads.
StarlangSoftware
nlptoolkit-datagenerator

Classification dataset generator library for high-level NLP tasks

1K 3 0
StarlangSoftware
nlptoolkit-datagenerator-cy

Classification dataset generator library for high-level NLP tasks

967 0 0
OmarSamirz
iftg

IFTG (ImageFromTextGenerator) is a Python package that simplifies creating robust datasets for OCR models. Generate images from text, apply over 10 built-in noise effects, and customize fonts and layouts. IFTG supports all languages and offers endless noise combinations, including custom noise creation.

439 21 2
hygull
kaggle-dataset-creator

A Python package that lets data scientists, data engineers, and data analysts create datasets in CSV or JSON form, so that they can either be submitted to Kaggle's dataset collection or used with pandas and similar tools.

405 0 0
thehetpandya
youtube-tts-data-generator

A Python library to generate speech datasets from YouTube videos

362 37 8
juliensimon
open-agent-traces

Generate realistic multi-agent workflow traces with LLM-enriched content, semantic validation, and PM4Py compatibility. pip install open-agent-traces

217 16 3
M-Farag
rawbuilder

An elegant datasets factory

181 5 0
wSanice
leblanc

leblanc is a modular Python library designed for the rapid generation of large-scale synthetic datasets across various business sectors. It is primarily built using Pandas, NumPy, and Faker to create realistic, structured DataFrames suitable for Data Science training, testing, and exploratory data analysis (EDA).

181 1 0
Data from PyPI, GitHub, ClickHouse, and BigQuery