A general approach for the derivation of nonlinear parameterizations of neglected scales is presented for nonlinear systems subject to an autonomous forcing. In that respect, dynamically-based formulas are derived subject to a free scalar parameter to be determined per mode to parameterize. For each high mode, this free parameter is obtained by minimizing a cost functional -a parameterization defect -depending on solutions from direct numerical simulation (DNS) but over short training periods of length comparable to a characteristic recurrence or decorrelation time of the dynamics.An important class of dynamically-based formulas, for our parameterizations to optimize, are obtained as parametric variations of manifolds approximating the invariant ones. To better appreciate the origins of the modified manifolds thus obtained, the standard approximation theory of invariant manifolds is revisited in Part I of this article. A special emphasis is put on backward-forward (BF) systems naturally associated with the original system, whose asymptotic integration provides the leading-order approximation of invariant manifolds.Part II presents then (i) the modifications of these approximating manifolds based also on integration of the same BF systems but this time over a finite time τ , and (ii) the variational approach aimed at making an efficient selection of τ per mode to parameterize. The parametric class of leading interaction approximation (LIA) of the high modes obtained this way, is completed by another parametric class built from the quasi-stationary approximation (QSA); close to the first criticality, the QSA is an approximation to the LIA, but it differs as one moves away from criticality.Rigorous results are derived that show that -given a cutoff dimension -the best manifolds that can be obtained through our variational approach, are manifolds which are in general no longer invariant. The minimizers are objects, called the optimal parameterizing manifolds (PMs), that are intimately tied to the conditional expectation of the original system, i.e. the best vector field of the reduced state space resulting from averaging of the unresolved variables with respect to a probability measure conditioned on the resolved variables.Applications to the closure of low-order models of Atmospheric Primitive Equations and Rayleigh-Bénard convection are then discussed. The approach is finally illustrated -in the context of the Kuramoto-Sivashinsky turbulence -as providing efficient closures without slaving for a cutoff scale kc placed within the inertial range and the reduced state space is just spanned by the unstable modes, without inclusion of any stable modes whatsoever. The underlying optimal PMs obtained by our variational approach are far from slaving and allow for remedying the excessive backscatter transfer of energy to the low modes encountered by the LIA or the QSA parameterizations in their standard forms, when they are used at this cutoff wavelength.