RL Python Packages

torchrl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

1.1M 3K 465

sb3-contrib

Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code

406K 727 243

tensorlake

Tensorlake is a serverless runtime for running sandboxes and deploying background agentic applications

160K 953 136

tianshou

An elegant PyTorch deep reinforcement learning library.

155K 11K 1K

judgeval

The Continuous-Improvement Stack for Agents. Our environment data and evals power agent improvement and monitoring.

147K 1K 93

gem-llm

A Gym for Agentic LLMs

141K 499 33

neptune

📘 The experiment tracker for foundation model training

136K 622 75

dopamine-rl

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.

76K 11K 1K

neptune-client

📘 The experiment tracker for foundation model training

68K 622 75

hud-python

RL environments + evals for AI agents. Define once, train anything.

56K 271 61

torchrl-nightly

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

45K 3K 465

rliable

[NeurIPS'21 Outstanding Paper] Library for reliable evaluation on RL and ML benchmarks, even with only a handful of seeds.

8K 873 49

torchstudio

Deep Learning Experiment

7K 2 0

flashbax

⚡ Flashbax: Accelerated Replay Buffers in JAX

7K 278 22

rl-zoo3

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

4K 3K 601

navix

Accelerated minigrid environments with JAX

4K 172 21

multi-agent-rlenv

Strongly typed reinforcement learning environment framework

3K 1 1

amp-rsl-rl

🔁 AMP-RSL-RL: Adversarial Motion Priors for robotic RL (PPO + motion imitation)

3K 329 26

awex

A high-performance framework for synchronizing weights between RL training and inference, designed to propagate parameter updates from training to inference within seconds in RL workflows

3K 160 18

dsse

The Drone Swarm Search project provides an environment for SAR missions built on PettingZoo, where agents, represented by drones, are tasked with locating targets identified as shipwrecked individuals.

3K 75 14

trianglengin

High-performance C++/Python engine for a triangle puzzle game.

3K 0 0

moonfish

~2000 Elo Python Chess Engine that implements: Negamax, PeSTO’s Evaluation, Null Move, Quiescence Search, Lazy SMP.

2K 26 4

rlox

Rust-accelerated reinforcement learning — 22 algorithms, 2-10x faster than SB3. The Polars of RL.

2K 6 1

areno

An easy-to-use, fast toolkit to scale up RL post-training on a single node.

2K 8 0