PyRank

Apache Arrow Python Packages

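Python packages tagged with the GitHub topic apache-arrow, sorted by relevance. Each entry gives the repository owner, the package name, a short description, and a stats line that leads with monthly downloads followed by GitHub stars.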
aws
awswrangler

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

86.3M 4K 729
scikit-hep
awkward

Manipulate JSON-like data with NumPy-like idioms.

1.4M 959 123
scikit-hep
awkward-cpp

Manipulate JSON-like data with NumPy-like idioms.

1.2M 959 123
scikit-hep
awkward0

Manipulate arrays of complex data structures as easily as Numpy.

262K 214 39
cldellow
parquet-metadata

Dump metadata about a Parquet file.

205K 11 2
mongodb-labs
pymongoarrow

MongoDB integrations for Apache Arrow. Export MongoDB documents to numpy array, parquet files, and pandas dataframes in one line of code.

97K 114 19
influxdata
flightsql-dbapi

DB API 2 interface for Flight SQL with SQLAlchemy extras.

59K 43 6
developmentseed
lonboard

Fast, interactive geospatial data visualization in Jupyter.

40K 945 52
scikit-hep
awkward1

Manipulate JSON-like data with NumPy-like idioms.

33K 959 123
tradewelltech
protarrow

Convert from protobuf to arrow and back

27K 40 6
PSU3D0
formualizer

Embeddable spreadsheet engine — parse, evaluate & mutate Excel workbooks from Rust, Python, or the browser. Arrow-powered, 320+ functions.

22K 129 14
nanoporetech
lib-pod5

Pod5: a high performance file format for nanopore reads.

20K 174 37
columnar-tech
dbc

dbc is the command-line tool for installing and managing ADBC drivers

17K 109 10
nanoporetech
pod5

Pod5: a high performance file format for nanopore reads.

15K 174 37
AndreaBozzo
dataprof

Library and CLI for profiling tabular data

15K 14 1
abdenlab
oxbow

Oxbow makes genomic data ready for high-performance analytics.

7K 153 15
Query-farm
vgi-rpc

Transport-agnostic RPC framework built on Apache Arrow IPC serialization. Define RPC interfaces as Python Protocol classes with automatic schema derivation, typed client proxies, and streaming support.

7K 10 0
hypertopos
hypertopos

Understand the structure of your data — without training machine learning models

5K 2 0
bug-ops
pyhdb-rs

SAP HANA meets modern Python. Rust-powered driver with zero-copy Arrow, native Polars/pandas support, async pooling. Includes MCP server for AI assistants.

5K 1 0
tradewelltech
beavers

Python stream processing for analytics

4K 41 2
scikit-hep
awkward-numba

Manipulate arrays of complex data structures as easily as Numpy.

3K 214 39
arrowjet
arrowjet

The fastest way to move data in and out of a database.

3K 1 1
rpy2
rpy2-arrow

Share Apache Arrow datasets between Python and R.

3K 19 4
mluttikh
xml2arrow

Efficiently convert XML data to Apache Arrow format.

3K 1 0
Data from PyPI, GitHub, ClickHouse, and BigQuery.