Canonical transformation ͑CT͒ theory provides a rigorously size-extensive description of dynamic correlation in multireference systems, with an accuracy superior to and cost scaling lower than complete active space second order perturbation theory. Here we expand our previous theory by investigating ͑i͒ a commutator approximation that is applied at quadratic, as opposed to linear, order in the effective Hamiltonian, and ͑ii͒ incorporation of the three-body reduced density matrix in the operator and density matrix decompositions. The quadratic commutator approximation improves CT's accuracy when used with a single-determinant reference, repairing the previous formal disadvantage of the single-reference linear CT theory relative to singles and doubles coupled cluster theory. Calculations on the BH and HF binding curves confirm this improvement. In multireference systems, the three-body reduced density matrix increases the overall accuracy of the CT theory. Tests on the H 2 O and N 2 binding curves yield results highly competitive with expensive state-of-the-art multireference methods, such as the multireference Davidson-corrected configuration interaction ͑MRCI+ Q͒, averaged coupled pair functional, and averaged quadratic coupled cluster theories.