Mixture of Experts Python Packages

nvidia-cudnn-frontend

cuDNN Frontend is NVIDIA's modern, open-source entry point to the cuDNN library and a growing collection of high-performance open-source kernels.

3.4M 862 197

deepspeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

1.4M 43K 5K

egobox

Efficient global optimization toolbox in Rust: Bayesian optimization, mixture of Gaussian processes, sampling methods

44K 182 12

smt

SMT: The Surrogate Modeling Toolbox

40K 895 230

terradev-cli

An imperative command-line interface for AI workload orchestration

20K 21 3

optillm

Optimizing inference proxy for LLMs

10K 4K 368

sawyer-core

Distributed MoE inference network — the load is split, friends help.

7K 0 0

krasis

Krasis is a hybrid LLM runtime focused on efficiently running larger models on consumer-grade, VRAM-limited hardware

6K 477 27

hivemind

Decentralized deep learning in PyTorch. Built to train models across thousands of volunteers around the world.

3K 2K 230

switch-transformers

Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"

2K 142 18

peer-pytorch

PyTorch implementation of the PEER block from the paper "Mixture of A Million Experts" by Xu Owen He at DeepMind

2K 137 6

mixture-of-experts

A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models

2K 863 71

dissenter

A PyPI package implementing a multi-LLM ensemble debate framework. Routes prompts across Claude, GPT, Gemini, Ollama, and any LiteLLM-compatible provider; surfaces disagreements across models and synthesizes a consensus answer.

2K 1 0

abliterix

Automated alignment adjustment for LLMs — direct steering, LoRA, and MoE expert-granular abliteration, optimized via multi-objective Optuna TPE.

2K 172 35

st-moe-pytorch

Implementation of ST-MoE, the latest incarnation of MoE after years of research at Google Brain, in PyTorch

1K 385 34

soft-moe-pytorch

Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch

868 347 10

mlx-flash

Run AI models too large for your Mac's memory — at near-full speed. Intelligent expert caching, speculative execution, and 15+ research techniques for MoE inference on Apple Silicon.

771 5 0

sinkhorn-router-pytorch

Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise

684 40 0

pytorch-mixtures

One-stop solutions for Mixture of Experts modules in PyTorch.

536 28 1

mlxlmprobe

Universal probing and interpretability tool for MLX language models on Apple Silicon

514 5 0

mixture-of-attention

Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts

509 122 5

lmxlab

Transformer language models on Apple Silicon with MLX

356 1 0

tracedistill

Distill teacher chains-of-thought into a LoRA adapter via a strict boxed-answer format contract and a two-phase Train→Nudge schedule (silver-medal NVIDIA Nemotron reasoning recipe).

352 0 0

mergoo

A library for easily merging multiple LLM experts and efficiently training the merged LLM.

277 518 33