PyRank

DPO Python Packages

Python packages tagged with the GitHub topic dpo, sorted by relevance and shown with their stars and monthly downloads.
MakazhanAlpamys
soup-cli

Soup turns the pain of LLM fine-tuning into a simple workflow. One config, one command, done.

23K 60 11
oumi-ai
oumi

Easily fine-tune, evaluate and deploy Gemma 4, Qwen3.5, Qwen3.6, gpt-oss, DeepSeek-R1, or any open source LLM / VLM!

3K 9K 765
altaidevorg
afterimage

Generate conversational, tool-calling, structured-output, and preference datasets — easily and at scale

3K 38 1
Goekdeniz-Guelmez
mlx-lm-lora

Train LLMs on Apple silicon with MLX and the Hugging Face Hub

2K 335 42
dannylee1020
openpo

Build high quality synthetic datasets with AI feedback from 200+ LLMs

880 27 0
sail-sg
oat-llm

Online AlignmenT (OAT) for LLMs.

631 653 63
armbues
sillm-mlx

SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.

540 286 26
liuxiaotong
knowlyr-sandbox

Gymnasium-style RL framework for LLM agent training — MDP environments, three-layer process reward & SFT/DPO/GRPO policy optimization. CLI + MCP ready.

269 3 0
liuxiaotong
knowlyr-hub

Gymnasium-style RL framework for LLM agent training — MDP environments, three-layer process reward & SFT/DPO/GRPO policy optimization. CLI + MCP ready.

268 3 0
liuxiaotong
knowlyr-recorder

Gymnasium-style RL framework for LLM agent training — MDP environments, three-layer process reward & SFT/DPO/GRPO policy optimization. CLI + MCP ready.

262 3 0
liuxiaotong
knowlyr-reward

Gymnasium-style RL framework for LLM agent training — MDP environments, three-layer process reward & SFT/DPO/GRPO policy optimization. CLI + MCP ready.

254 3 0
liuxiaotong
knowlyr-core

Gymnasium-style RL framework for LLM agent training — MDP environments, three-layer process reward & SFT/DPO/GRPO policy optimization. CLI + MCP ready.

246 3 0
TUDB-Labs
mlora-cli

An Efficient "Factory" to Build Multiple LoRA Adapters

244 378 66
toolbrain
toolbrain

A framework for agentic tool use training with reinforcement learning

209 167 19
warlockee
oxrl

A lightweight post-training framework for LLMs and VLMs

182 17 2
ServiceNow
sygra

Graph-oriented Synthetic data generation Pipeline library

152 81 15
liuxiaotong
knowlyr-trainer

Gymnasium-style RL framework for LLM agent training — MDP environments, three-layer process reward & SFT/DPO/GRPO policy optimization. CLI + MCP ready.

141 3 0
li-plus
flash-pref

Accelerate LLM preference tuning via prefix sharing with a single line of code

116 52 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery