Reward Shaping Python Packages

host-pytorch

Implementation of Humanoid Standing Up, from the paper "Learning Humanoid Standing-up Control across Diverse Postures" out of Shanghai, in PyTorch

3K 46 5

verdict

Inference-time scaling for LLMs-as-a-judge.

996 345 28

goodhart

Catch reward traps before training. Static analysis for RL reward functions.

353 0 0

opfgym

A Gymnasium-compatible framework for creating reinforcement learning (RL) environments that solve the optimal power flow (OPF) problem. Contains five OPF benchmark environments for comparable research.

316 30 3

shaner

Reward shaping library

165 3 1