PyRank

Data Science Python Packages

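Python packages tagged with the GitHub topic data-science, sorted by relevance. Each entry gives the repository owner, the package name, the repository description, and a line of figures: monthly downloads first, then GitHub stars, then a third count that the page does not label.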
pandas-dev
pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

703.2M 49K 20K
matplotlib
matplotlib

matplotlib: plotting with Python

218.8M 23K 8K
scikit-learn
scikit-learn

scikit-learn: machine learning in Python

208.6M 66K 27K
ipython
ipython

Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.

164.8M 17K 4K
aws
awswrangler

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

86.3M 4K 729
pymupdf
pymupdf

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

79.5M 10K 726
snowflakedb
snowflake-snowpark-python

Snowflake Snowpark Python API

66.8M 333 147
ray-project
ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

56.1M 43K 8K
mwaskom
seaborn

Statistical data visualization in Python

51.7M 14K 2K
modal-labs
modal

SDK libraries for Modal

51.3M 473 95
aws
redshift-connector

Redshift Python Connector. It supports Python Database API Specification v2.0.

48.7M 218 87
apache
apache-airflow-providers-common-sql

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

45.9M 45K 17K
statsmodels
statsmodels

Statsmodels: statistical modeling and econometrics in Python

36.7M 11K 3K
great-expectations
great-expectations

Always know what to expect from your data.

31.3M 12K 2K
apache
apache-airflow-providers-fab

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

29.9M 45K 17K
supabase
realtime

Python Client for Supabase. Query Postgres from Flask, Django, FastAPI. Python user authentication, security policies, edge functions, file storage, and realtime data streaming. Good first issue.

29.5M 3K 484
streamlit
streamlit

Streamlit — A faster way to build and share data apps.

29.2M 45K 4K
wandb
wandb

The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

26.2M 11K 872
supabase
supabase

Python Client for Supabase. Query Postgres from Flask, Django, FastAPI. Python user authentication, security policies, edge functions, file storage, and realtime data streaming. Good first issue.

25.1M 3K 484
supabase
storage3

Python Client for Supabase. Query Postgres from Flask, Django, FastAPI. Python user authentication, security policies, edge functions, file storage, and realtime data streaming. Good first issue.

23.7M 3K 484
supabase
postgrest

Python Client for Supabase. Query Postgres from Flask, Django, FastAPI. Python user authentication, security policies, edge functions, file storage, and realtime data streaming. Good first issue.

23.3M 3K 484
apache
apache-airflow-providers-http

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

22.6M 45K 17K
apache
apache-airflow-providers-databricks

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

22.1M 45K 17K
explosion
spacy

💫 Industrial-strength Natural Language Processing (NLP) in Python

21.6M 34K 5K
    • Data from PyPI, GitHub, ClickHouse, and BigQuery
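
The figures in each entry are abbreviated with K and M suffixes (for example "703.2M 49K 20K"). To compare or re-sort entries, they can be expanded into plain integers. The following is a minimal sketch, not part of PyRank itself: the helper names `parse_count` and `parse_stats` are hypothetical, the B suffix is an assumption that does not appear in the listing, and the third figure is kept under the key `unlabeled` because the page does not say what it measures.

```python
import re

# Multipliers for the abbreviated suffixes. K and M appear in the listing;
# B is an assumption added for completeness.
_SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_count(text: str) -> int:
    """Expand an abbreviated count such as '703.2M' or '729' to an integer."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMB]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized count: {text!r}")
    number, suffix = match.groups()
    # round() absorbs floating-point noise from the multiplication.
    return round(float(number) * _SUFFIX[suffix])


def parse_stats(line: str) -> dict:
    """Split one figures line into its three counts.

    The order (downloads, stars, third figure) is read off the listing;
    the meaning of the third figure is not stated there.
    """
    parts = line.split()
    if len(parts) != 3:
        raise ValueError(f"expected three figures, got {len(parts)}")
    downloads, stars, unlabeled = (parse_count(p) for p in parts)
    return {
        "monthly_downloads": downloads,
        "stars": stars,
        "unlabeled": unlabeled,
    }


if __name__ == "__main__":
    print(parse_stats("703.2M 49K 20K"))
```

Because the displayed values are already rounded, the expanded integers are approximations of the underlying counts, not exact totals.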