This article presents a data-driven, discrete-time control methodology for power electronic converters based on reinforcement learning (RL), and examines its key advantages and disadvantages relative to classical PID control designed in the frequency domain. A key advantage of the technique is that it obviates the need for an accurate system (plant) model: measured data are used to iteratively solve for an optimal control law. The underlying optimization algorithm stems from the linear quadratic regulator (LQR) and involves the iterative solution of an algebraic Riccati equation (ARE). Simulation results for a buck converter are provided to verify the effectiveness, and examine the limitations, of the proposed control strategy. A classical Type-III compensator was also simulated to provide a performance baseline for the proposed controller.
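To make the phrase "iterative solution of an ARE" concrete, the following is a minimal sketch of the standard discrete-time Riccati recursion for a single-input LQR problem. It is not the article's data-driven algorithm: it assumes the model matrices A and B are known, which is exactly the knowledge the proposed method replaces with measured data. The function names, the weights Q and r, and the plant values (a forward-Euler buck-converter-like model) are all hypothetical and chosen only for illustration.

```python
# Illustrative, model-based sketch of the discrete-time Riccati recursion.
# All numerical values below are hypothetical, NOT taken from the article.


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def transpose(X):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def riccati_iteration(A, B, Q, r, tol=1e-10, max_iter=10000):
    """Iterate P <- Q + A'PA - A'PB (r + B'PB)^-1 B'PA for a single input.

    A is n x n, B is n x 1, Q is a symmetric n x n state weight, and r > 0
    is the scalar input weight. Iteration stops when the largest entry-wise
    change in P falls below tol, or after max_iter sweeps.
    Returns (P, K), where K is the state-feedback gain in u = -K x.
    """
    n = len(A)
    At, Bt = transpose(A), transpose(B)
    P = [row[:] for row in Q]                    # start from P0 = Q
    for _ in range(max_iter):
        PA = matmul(P, A)
        BtPA = matmul(Bt, PA)[0]                 # row vector B'PA
        s = r + matmul(matmul(Bt, P), B)[0][0]   # scalar r + B'PB
        AtPA = matmul(At, PA)
        # P is symmetric, so A'PB is the transpose of B'PA.
        P_next = [[Q[i][j] + AtPA[i][j] - BtPA[i] * BtPA[j] / s
                   for j in range(n)] for i in range(n)]
        change = max(abs(P_next[i][j] - P[i][j])
                     for i in range(n) for j in range(n))
        P = P_next
        if change < tol:
            break
    # Feedback gain K = (r + B'PB)^-1 B'PA from the final iterate.
    BtPA = matmul(Bt, matmul(P, A))[0]
    s = r + matmul(matmul(Bt, P), B)[0][0]
    K = [g / s for g in BtPA]
    return P, K


if __name__ == "__main__":
    # Hypothetical forward-Euler model, states = [inductor current,
    # capacitor voltage], input = duty-cycle perturbation.
    A = [[1.0, -0.1], [0.1, 0.99]]
    B = [[1.2], [0.0]]
    Q = [[1.0, 0.0], [0.0, 1.0]]
    P, K = riccati_iteration(A, B, Q, r=1.0)
    print("P =", P)
    print("K =", K)
```

The recursion is written for a scalar input so that the term r + B'PB is a scalar and no matrix inversion is needed; a multi-input version would need a linear solve at that step.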