A stochastic differential equation (SDE) model is introduced for linear TD(0) learning under Markovian noise. The model separates the contraction dynamics from sampling effects and explains the error floor through the interaction between the long-run covariance of the noise and the geometry of the projected Bellman operator.
Diffusion Approximation for TD Learning with Linear Features
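The error floor mentioned in the summary can be illustrated with a small simulation. The sketch below is not the SDE model itself. It runs plain constant-step-size TD(0) with a single scalar feature on a hypothetical two-state Markov chain and measures the mean squared distance to the TD fixed point after a burn-in period. The chain `P`, features `PHI`, rewards `R`, discount `GAMMA`, and the helper names `td_fixed_point` and `error_floor` are all assumptions chosen for illustration. A diffusion heuristic would suggest that the floor shrinks roughly in proportion to the step size, which the two step sizes in the usage section let you check.

```python
import random

# Hypothetical two-state chain; every number here is an illustrative assumption.
P = [[0.9, 0.1], [0.1, 0.9]]   # sticky transitions, so successive samples are correlated
PHI = [1.0, 2.0]               # scalar linear feature for each state
R = [1.0, 0.0]                 # reward received in each state
GAMMA = 0.9                    # discount factor
MU = [0.5, 0.5]                # stationary distribution of P (the chain is symmetric)


def td_fixed_point():
    """Solve a * theta = b for the scalar TD(0) fixed point under MU."""
    a = 0.0
    b = 0.0
    for s in range(2):
        expected_next_phi = sum(P[s][t] * PHI[t] for t in range(2))
        a += MU[s] * PHI[s] * (PHI[s] - GAMMA * expected_next_phi)
        b += MU[s] * PHI[s] * R[s]
    return b / a


def error_floor(alpha, steps=200_000, burn_in=20_000, seed=0):
    """Average squared distance to the fixed point after burn-in, constant step size."""
    rng = random.Random(seed)
    theta_star = td_fixed_point()
    theta, s, total = 0.0, 0, 0.0
    for k in range(burn_in + steps):
        # Sample the next state from row s of P.
        s_next = s if rng.random() < P[s][s] else 1 - s
        # Linear TD(0) update with a scalar parameter.
        td_error = R[s] + GAMMA * PHI[s_next] * theta - PHI[s] * theta
        theta += alpha * td_error * PHI[s]
        if k >= burn_in:
            total += (theta - theta_star) ** 2
        s = s_next
    return total / steps


if __name__ == "__main__":
    for alpha in (0.05, 0.005):
        print(alpha, error_floor(alpha))
```

The mean update contracts toward the fixed point, while the sampled transitions keep perturbing the iterate, so the squared error settles at a nonzero level instead of vanishing. Comparing the two printed values gives a rough sense of how that level depends on the step size in this toy setting.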