This paper reviews a variety of ways to use trajectory optimization to accelerate dynamic programming. Dynamic programming provides a way to design globally optimal control laws for nonlinear systems. However, the curse of dimensionality, the exponential dependence of the space and computation resources needed on the dimensionality of the state and control, limits the application of dynamic programming in practice. We explore trajectory-based dynamic programming, which combines many local optimizations to accelerate the global optimization of dynamic programming.

What is Dynamic Programming?

Dynamic programming provides a way to find globally optimal control laws (policies), u = u(x), which give the appropriate action u for any state x [1,2]. Dynamic programming takes as input a one step cost (a.k.a. "reward" or "loss") function and the dynamics of the problem to be optimized. This paper focuses on offline planning of nonlinear control laws for control problems with continuous states and actions, deterministic time invariant discrete time dynamics x_{k+1} = f(x_k, u_k), and a time invariant one step cost function L(x, u), so we use discrete time dynamic programming. We focus on steady state policies and thus an infinite time horizon.

One approach to dynamic programming is to approximate the value function V(x) (the optimal total future cost from each state x) by repeatedly applying the Bellman update V(x_j) = min_u (L(x_j, u) + V(f(x_j, u))) at sampled states x_j until the value function estimates have converged. Typically the value function and control law are represented on a regular grid, and some type of interpolation is used to approximate these functions within each grid cell. If each dimension of the state and action is represented with a resolution R, and the dimensionality of the state is d_x and that of the action is d_u, the computational cost of the conventional approach is proportional to R^{d_x} × R^{d_u} and the memory cost is proportional to R^{d_x}. This is known as the Curse of Dimensionality [1].

An example problem: We use one link pendulum swingup as an example problem to provide the reader with a visualizable example of a nonlinear control law and corresponding value function. In one link pendulum swingup a motor at the base of the pendulum swings a rigid arm from the downward stable equilibrium to the upright unstable equilibrium and balances the arm there (Fig. 1). What makes this challenging is that the one step cost function penalizes the amount of torque used as well as the deviation of the current position from the goal, and the controller must minimize the total cost of the trajectory. The one step cost function for this example is a weighted sum of the squared position error (θ: the difference between the current angle and the goal angle) and the squared torque τ: L(x, u) = 0.1θ^2 + τ^2, where 0.1 weights the position error relative to the torque penalty. There are no costs associated with the joint velocity. The uniform density link has a mass of 1 kg, a length of 1 m, and a width of 0.1 m. Because the dynamics and cost function are time invariant, there is a steady state control law and...
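To make the grid-based approach concrete, the following is a minimal sketch of value iteration for the one link pendulum swingup described above. The integration time step, grid ranges and resolution, discretized torque set, and discount factor are illustrative assumptions (the paper does not specify them), and a nearest grid point lookup stands in for the within-cell interpolation mentioned in the text.

```python
import numpy as np

# Sketch: grid-based value iteration for one link pendulum swingup.
# Time step, grid ranges/resolution, torque set, and discount factor are
# assumed values for illustration, not taken from the paper.

m, l, g = 1.0, 1.0, 9.81              # uniform link: mass 1 kg, length 1 m
I = m * l ** 2 / 3.0                  # moment of inertia about the pivot
dt = 0.02                             # assumed integration time step
gamma = 0.99                          # assumed discount (near-infinite horizon)

R = 101                                            # grid resolution per state dim
thetas = np.linspace(-np.pi, np.pi, R)             # angle error from the upright goal
vels = np.linspace(-10.0, 10.0, R)                 # angular velocity
taus = np.linspace(-5.0, 5.0, 21)                  # assumed discrete torque set

# Enumerate every grid state against every action: the R^{d_x} x R^{d_u}
# table whose growth with dimension is the curse of dimensionality.
TH, VD, TAU = np.meshgrid(thetas, vels, taus, indexing="ij")

# Discrete time dynamics x_{k+1} = f(x_k, u_k): one Euler step of the pendulum.
acc = (TAU + m * g * (l / 2.0) * np.sin(TH)) / I
VD2 = np.clip(VD + acc * dt, vels[0], vels[-1])
TH2 = np.mod(TH + VD2 * dt + np.pi, 2.0 * np.pi) - np.pi

# One step cost L(x, u) = 0.1*theta^2 + tau^2, with no penalty on velocity.
L = dt * (0.1 * TH ** 2 + TAU ** 2)

# Nearest grid indices of each successor state (stand-in for interpolation).
i2 = np.clip(np.rint((TH2 - thetas[0]) / (thetas[1] - thetas[0])), 0, R - 1).astype(int)
j2 = np.clip(np.rint((VD2 - vels[0]) / (vels[1] - vels[0])), 0, R - 1).astype(int)

V = np.zeros((R, R))                               # value function table on the grid
for sweep in range(5000):                          # repeated Bellman backups
    Q = L + gamma * V[i2, j2]                      # L(x, u) + V(f(x, u))
    V_new = Q.min(axis=2)                          # minimize over actions
    if np.max(np.abs(V_new - V)) < 1e-3:           # stop when estimates converge
        break
    V = V_new

policy = taus[Q.argmin(axis=2)]                    # greedy control law u(x) on the grid
```

Each sweep visits all R^{d_x} = 101^2 grid states and all discretized actions, so the work per sweep scales as R^{d_x} × R^{d_u} and the table storage as R^{d_x}; at the same resolution, a two link version (d_x = 4) would already require on the order of 10^8 grid states, which is the curse of dimensionality in practice.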