Themis: An explainable AI-enabled framework for Reinforcement Learning with Human Feedback
The authors introduce Themis, an XAI-enabled testing and evaluation framework that combines transparency through explainability with alignment via human feedback for safe Reinforcement Learning systems.