PyRank

Vision And Language Python Packages

Python packages with the GitHub topic vision-and-language. Sorted by relevance, with stars and monthly downloads.
zhuang-li
factualscenegraph

[ACL 2023 Findings] The FACTUAL dataset and the textual scene graph parser trained on FACTUAL.

23K 127 12
batmanlab
mammoclip

[MICCAI 2024, top 11%] Official PyTorch implementation of Mammo-CLIP: A Vision Language Foundation Model to Enhance Data Efficiency and Robustness in Mammography

2K 93 34
cdancette
multimodal

A collection of multimodal datasets for research.

642 83 8
roboflow
maestro

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

498 3K 222
roboflow
multimodal-maestro

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

343 3K 222
naver-ai
eccv-caption

Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)

338 56 2
roboflow
setofmark

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

225 3K 222
ELkarousWissem
modu-muse

Modular + Muse, for inspiring multimodal intelligence.

204 3 0
mozuma
mozuma

Model Zoo for Multimedia Applications

180 8 1
om-ai-lab
omagent-core

[EMNLP-2024] Build multimodal language agents for fast prototyping and production

167 3K 288
    • Data from PyPI, GitHub, ClickHouse, and BigQuery