
Revisiting Temporal Difference Learning

Published: at 11:10 PM

Abstract

This report revisits Richard S. Sutton’s seminal TD(λ) methods by replicating the experiments from his 1988 paper on the random walk prediction problem. We evaluate the robustness and applicability of TD learning through a comparative analysis of supervised learning and TD(λ) strategies. Our findings confirm the efficacy of TD(λ) in learning from temporal differences and adapting to partial information. Adjustments to learning parameters such as the learning rate and convergence threshold show a clear impact on learning outcomes, particularly the influence of the λ value on prediction accuracy and efficiency. This study supports the foundational principles of TD learning and corroborates its relevance through rigorous empirical validation.
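To make the setting concrete, the following is a minimal sketch of TD(λ) with accumulating eligibility traces on Sutton's (1988) bounded random walk. The function name, the number of states, and the parameter values (alpha, lam, episodes) are illustrative assumptions, not the exact experimental protocol used in the report.

```python
import numpy as np

def td_lambda_random_walk(episodes, lam=0.3, alpha=0.05, n_states=5, seed=0):
    """TD(lambda) on a bounded random walk with terminal outcomes 0 and 1.

    Non-terminal states are one-hot encoded, so the weight vector w holds
    one value estimate per state; the true values are [1/6, ..., 5/6].
    This is a hypothetical sketch, not the report's exact setup.
    """
    rng = np.random.default_rng(seed)
    w = np.full(n_states, 0.5)              # value estimates, one per state
    for _ in range(episodes):
        z = np.zeros(n_states)              # eligibility trace
        s = n_states // 2                   # start in the centre state
        while True:
            s_next = s + (1 if rng.random() < 0.5 else -1)
            if s_next == n_states:          # exit on the right: outcome 1
                reward, v_next, done = 1.0, 0.0, True
            elif s_next == -1:              # exit on the left: outcome 0
                reward, v_next, done = 0.0, 0.0, True
            else:
                reward, v_next, done = 0.0, w[s_next], False
            delta = reward + v_next - w[s]  # undiscounted TD error
            z[s] += 1.0                     # accumulate trace for current state
            w += alpha * delta * z          # update all traced states
            z *= lam                        # decay traces
            if done:
                break
            s = s_next
    return w

if __name__ == "__main__":
    print(td_lambda_random_walk(episodes=1000))
```

With λ = 1 the update reduces to a supervised (Monte-Carlo-style) fit to the episode outcome, while smaller λ values lean more heavily on successive predictions, which is the comparison the abstract refers to.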

Full Paper

Revisiting Temporal Difference Learning

The full report is available as a PDF.