Qwen3 Python Packages

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

5.8M 86K 19K

ms-swift

Use PEFT or full-parameter training to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, Phi4, ...) (AAAI 2025).

104K 15K 2K

vllm-tpu

A high-throughput and memory-efficient inference and serving engine for LLMs

58K 86K 19K

hud-python

RL environments + evals for AI agents. Define once, train anything.

56K 272 61

nemo-automodel

🚀 PyTorch Distributed native training library for LLMs/VLMs with out-of-the-box Hugging Face support

25K 677 200

qwen3-embed

Lightweight Qwen3 text embedding and reranking via ONNX Runtime and GGUF

18K 4 0

vllm-cpu-nightly

A high-throughput and memory-efficient inference and serving engine for LLMs

16K 86K 19K

steadytext

Deterministic text generation and embeddings with zero configuration

2K 44 3

echoalign-asr-mlx

Local Apple Silicon CLI for ASR, subtitles, WebVTT/SRT, and timestamp-aligned JSON with MLX + Qwen3

2K 1 0

kakeyalattice

Discrete Kakeya cover for LLM KV cache: D4/E8 nested-lattice quantisation realising a Kakeya-style tube-cover over the direction sphere. 2.4x-2.8x compression at <1% perplexity loss on Qwen3, Llama-3, DeepSeek, GLM-4, Gemma. Drop-in transformers.DynamicCache. pip install kakeyalattice.

1K 9 2

taiwan-asr-toolkit

Production-grade Traditional Chinese / Taiwan Mandarin speech-to-text. Qwen3-ASR + MediaTek Breeze-ASR-25, hot-word injection, LLM polish, speaker diarization. RTF up to 1554x on RTX 5090, 56 TDD tests.

873 2 0

ai-dynamo-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

606 86K 19K

vllm-acc

A high-throughput and memory-efficient inference and serving engine for LLMs

575 86K 19K

sixfinger

SixFinger API - Free Claude API - 10-20x Faster AI Chat API - 40+ models

567 20 4

vllm-xft

A high-throughput and memory-efficient inference and serving engine for LLMs

506 86K 19K

german-ocr

German-OCR is specifically trained to extract text from German documents including invoices, receipts, forms, and other business documents.

500 112 7

ggufloader

GGUF Loader with Agentic Mode and a floating button for AI models. Open source and offline. Mistral, DeepSeek, Llama, Gemma, Qwen

441 57 12

openclaw-ontology-engine

Declarative governance + tool-ontology engine for Agent Runtime control planes — bring your own YAML (Layer 2), get tool governance + semantic query + governance audit (Layer 1).

433 10 1

vllm-musa

vLLM platform plugin for Moore Threads MUSA GPUs

426 86K 19K

deepsearcher

Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.

419 8K 764

vllm-consul

A high-throughput and memory-efficient inference and serving engine for LLMs

407 86K 19K

vllm-npu

A high-throughput and memory-efficient inference and serving engine for LLMs

401 86K 19K

nextai-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

399 86K 19K

metascreener

Open-source multi-LLM ensemble tool for systematic review workflows

364 1K 47