PyRank

Multi-Modal Python Packages

Python packages with the GitHub topic multi-modal. Sorted by relevance, with stars and monthly downloads.
modelscope
modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

4.5M 9K 943
agentscope-ai
agentscope

Build and run agents you can see, understand and trust.

236K 25K 3K
docarray
docarray

Represent, send, store and search multimodal data

129K 3K 243
MedMNIST
medmnist

[pip install medmnist] 18x Standardized Datasets for 2D and 3D Biomedical Image Classification

46K 1K 207
awslabs
pyrhubarb

A Python framework for multi-modal document understanding with Amazon Bedrock

44K 103 14
valhalla
pyvalhalla

Open Source Routing Engine for OpenStreetMap

6K 6K 894
OFA-Sys
cn-clip

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

6K 6K 550
datajuicer
py-data-juicer

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

5K 6K 371
lucidrains
dalle-pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

5K 6K 643
answerdotai
byaldi

Use late-interaction multi-modal models such as ColPali in just a few lines of code.

4K 847 93
lucidrains
transfusion-pytorch

Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

4K 1K 72
kyegomez
qwen

My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't released model code yet sooo...

3K 13 3
avitai
avitai-artifex

A research-focused modular generative modeling library built on JAX/Flax NNX

3K 1 0
valhalla
pyvalhalla-weekly

Open Source Routing Engine for OpenStreetMap

2K 6K 894
zjunlp
deepke

DeepKE is a knowledge extraction toolkit for knowledge graph construction supporting low-resource, document-level and multimodal scenarios for entity, relation and attribute extraction.

2K 4K 744
BrainLesion
brainles-preprocessing

preprocessing tools for multi-modal 3D brain imaging

2K 33 8
kyegomez
vision-llama

Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta

2K 15 0
kyegomez
switch-transformers

Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"

1K 141 18
johndef64
mychatgpt

mychatgpt is a small and useful Python package that provides utils to create OpenAI's GPT conversational agents. This module allows users to have interactive chat with GPT models and keeps track of the chat history. Useful in Python projects as Copilot agent.

658 5 0
RasmussenLab
move-dl

Multi-omics variational autoencoder

560 96 33
kyegomez
tiny-gptv

Tiny GPTV - Pytorch

543 16 0
lucidrains
dalle-pytorch-dev

DALL-E - Pytorch

491 6K 643
kyegomez
mm1-torch

PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"

476 28 1
kyegomez
simba-torch

A simpler Pytorch + Zeta Implementation of the paper: "SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series"

451 29 2
Data from PyPI, GitHub, ClickHouse, and BigQuery