“…In the best-known BP formulation, due to Rumelhart et al. [3], x^s, the vector of "before-node" net inputs [see Equation (6)], is treated as the state vector, whereas in optimal control y^s, the vector of "after-node" outputs, is chosen as the state vector. … two hidden nodes (P_2 = 2), and three terminal outputs (F ≡ P_3 = 3); hence, 13 parameters in total, including threshold parameters: (a) the desired block-arrow Hessian matrix, whose arrowhead should point downward to the right (see [4], [5]), with F (= 3) diagonal blocks; (b) a Hessian matrix with a complex sparsity pattern that is hard to exploit, obtained with NETLAB, a MATLAB-based software package (see mlphess.m at http://www.ncrg.aston.ac.uk/netlab/). For large-scale optimization, approximating the inverse of the Hessian is not advisable, because the inverse is generally dense even when the Hessian itself is sparse.…”
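
The remark about the inverse becoming dense can be checked numerically. The sketch below is illustrative only and is not the excerpt's method: it builds a 13 × 13 symmetric matrix with a block-arrow pattern (three 3 × 3 diagonal blocks plus a dense border of width 4, a layout assumed here to match the 13-parameter example), fills it with random placeholder values rather than a real MLP Hessian, and compares the nonzero counts of the matrix and its inverse. The helper name `block_arrow_spd` is hypothetical.

```python
import numpy as np


def block_arrow_spd(block_sizes, border, seed=0):
    """Return a symmetric positive-definite matrix with a block-arrow pattern.

    The values are random placeholders; only the sparsity pattern matters here.
    """
    rng = np.random.default_rng(seed)
    n = sum(block_sizes) + border
    A = np.zeros((n, n))
    start = 0
    for size in block_sizes:
        stop = start + size
        A[start:stop, start:stop] = rng.uniform(0.1, 1.0, (size, size))
        start = stop
    # Dense bottom rows; symmetrizing mirrors them into the right-hand columns,
    # which gives an arrowhead pointing downward to the right.
    A[start:, :] = rng.uniform(0.1, 1.0, (border, n))
    A = 0.5 * (A + A.T)
    # Strict diagonal dominance with a positive diagonal ensures positive definiteness.
    return A + n * np.eye(n)


# Assumed layout: F = 3 diagonal blocks of size 3 and a border of width 4 (13 in total).
H = block_arrow_spd(block_sizes=[3, 3, 3], border=4)
H_inv = np.linalg.inv(H)  # formed here only to inspect its pattern

nnz_H = int(np.count_nonzero(H))
nnz_inv = int(np.count_nonzero(np.abs(H_inv) > 1e-12))
print(f"nonzeros in H:      {nnz_H} of {H.size}")
print(f"nonzeros in inv(H): {nnz_inv} of {H_inv.size}")

# A Newton-type step can be obtained by solving H d = -g without forming inv(H).
g = np.ones(H.shape[0])  # placeholder gradient
step = np.linalg.solve(H, -g)
```

In this toy case the block-arrow matrix keeps its zero off-diagonal blocks while the inverse fills in completely, which is why solving the linear system directly (or, at scale, with a factorization that exploits the arrow structure) is usually preferred over storing an inverse.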