Topic
Reinforcement Learning Security
How reward functions, agents, environments, and evaluators can be gamed, exploited, or misaligned under optimization pressure.