A major challenge in the design of large-scale cyber-physical systems, such as power systems, the Internet of Things or intelligent transportation systems, is that traditional distributed optimal control methods do not scale gracefully, in either controller synthesis or controller implementation, to systems composed of millions, billions or even trillions of interacting subsystems. This paper shows that this challenge can now be addressed by leveraging the recently introduced System Level Approach (SLA) to controller synthesis. In particular, in the context of the SLA, we define suitable notions of separability for control objective functions and system constraints such that the global optimization problem (or the iterate update problems of a distributed optimization algorithm) can be decomposed into parallel subproblems. We then show that if additional locality (i.e., sparsity) constraints are imposed, these subproblems can be solved using local models and local decision variables. The SLA is essential to maintaining the convexity of these problems under locality constraints. As a consequence, the resulting synthesis methods have O(1) complexity relative to the size of the global system. We further show that many optimal control problems of interest, such as (localized) LQR and LQG, H2 optimal control with joint actuator and sensor regularization, and (localized) mixed H2/L1 optimal control problems, satisfy these notions of separability, and we use these problems to explore tradeoffs in performance, actuator and sensing density, and average versus worst-case performance for a large-scale power-inspired system.
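To make the idea of separability under locality constraints concrete, the following is a minimal sketch on a toy static problem, not the SLA synthesis problem itself. It assumes a least-squares objective ||A X - B||_F^2, which splits into one independent cost per column of X, and a sparsity constraint that restricts column j to a small index set. The matrix `A`, the supports, and the helper names `solve_linear`, `solve_column` and `column_cost` are all hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def solve_linear(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z

def solve_column(A, b, support):
    """Minimize ||A x - b||^2 with x forced to zero outside `support`.

    Only the columns of A indexed by `support` are used (a "local model"),
    and only len(support) unknowns are solved for (local decision variables).
    """
    A_loc = [[row[i] for i in support] for row in A]
    k = len(support)
    # Normal equations of the reduced problem: (A_loc^T A_loc) z = A_loc^T b.
    G = [[sum(r[p] * r[q] for r in A_loc) for q in range(k)] for p in range(k)]
    h = [sum(r[p] * bi for r, bi in zip(A_loc, b)) for p in range(k)]
    z = solve_linear(G, h)
    x = [0.0] * len(A[0])
    for i, zi in zip(support, z):
        x[i] = zi
    return x

def column_cost(A, x, b):
    """Squared residual ||A x - b||^2 for a single column."""
    return sum((sum(a * xi for a, xi in zip(row, x)) - bi) ** 2
               for row, bi in zip(A, b))

# Hypothetical 4-node chain: each node interacts only with its neighbours.
A = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
n = len(A)
B_cols = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
# Locality constraint (assumed): column j may be nonzero only at j-1, j, j+1.
supports = [[i for i in (j - 1, j, j + 1) if 0 <= i < n] for j in range(n)]

# The column subproblems share no variables, so they can run in parallel.
with ThreadPoolExecutor() as pool:
    X_cols = list(pool.map(
        lambda j: solve_column(A, B_cols[j], supports[j]), range(n)))

# The global objective is the sum of the per-column costs.
total_cost = sum(column_cost(A, X_cols[j], B_cols[j]) for j in range(n))
print(total_cost)
```

In this sketch each subproblem has at most three unknowns regardless of `n`, which is the sense in which the per-subproblem cost does not grow with the size of the global system. The thread pool only illustrates that the subproblems are independent; it is not a claim about how such problems are solved in practice.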