PyRank

Multimodal Large Language Models Python Packages

Python packages with the GitHub topic multimodal-large-language-models. Sorted by relevance, with stars and monthly downloads.
Moenupa
deocr

A high-performance, highly customizable reverse OCR tool that renders text or Hugging Face-compatible datasets to images. Dimensions, DPI, and CSS are configurable!

712 2 0
rese1f
moviechat

Long video understanding

436 693 42
zhudotexe
kani-multimodal-core

Core shared libraries for multimodal Kani extensions.

408 2 0
multimind-dev
multimind-sdk

Your SDK solves all of this. One interface. Unified logic. Local + hosted models. Fine-tuning. Agent tools. Enterprise-ready. Hybrid RAG. Star 🌟 if you like it!

191 92 14
Video-Bench
videobench

Video Generation Benchmark

166 79 4
thisisiron
vt-calc

🧮 Calculator for vision tokens in VLMs.

165 1 0
HA-Video-Bench
habench

Video Generation Benchmark

164 80 4
zjunlp
deco-mllm

[ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation

130 144 13
bcdnlp
faithscore

FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models

94 33 7
    • Data from PyPI, GitHub, ClickHouse, and BigQuery