Tao of RWD Blog

Yunzhe (Jeff) Zhou, Aimee Harrison, Andy Wilson

Dynamic Treatment Regimes with RL, Part II: Optimizing Pessimism

An introduction to the pessimism principle from offline reinforcement learning, and how a Bayesian learning approach can apply it to dynamic treatment regime estimation to produce treatment recommendations that account for model uncertainty.

14 March 2026
Reinforcement Learning

MaryLena Bleile, Aimee Harrison, Andy Wilson

Dynamic Treatment Regimes with RL, Part I

This article introduces dynamic treatment regimes (DTRs) as a bridge between causal inference and reinforcement learning (RL), showing how sequential clinical decisions can be optimized using Q-learning and backward induction while acknowledging the challenge of model unreliability with sparse data.

28 February 2026