Tao of RWD Blog

Reinforcement Learning

Yunzhe (Jeff) Zhou, Aimee Harrison, Andy Wilson

Dynamic Treatment Regimes with RL, Part II: Optimizing Pessimism

An introduction to the pessimism principle from offline reinforcement learning, and how a Bayesian learning approach applies it to dynamic treatment regime estimation to produce treatment recommendations that account for model uncertainty.

14 March 2026
