In this chapter we extend the ADP algorithm Dual Heuristic Programming (DHP) to include a "bootstrapping" parameter λ, analogous to the one used in the Reinforcement Learning algorithm TD(λ). The resulting algorithm, which we call VGL(λ) for value-gradient learning, is proven to produce a weight update that can be equivalent to backpropagation through time (BPTT) applied to a greedy policy on a critic function. This provides a surprising connection between the two alternative methods of BPTT and DHP. Under certain smoothness conditions, VGL(λ = 1) with a greedy policy acquires the strong convergence properties of BPTT, while using a general function