off-policy-evaluation
Open Bandit Pipeline: a Python library for bandit algorithms and off-policy evaluation
Implementations and examples of common offline policy evaluation methods in Python.
SCOPE-RL: A pipeline for offline reinforcement learning research and applications