Flip-flops (FFs) are the most commonly used sequential elements in synchronous circuits, but their timing requirements limit the operating frequency. Latch-based time borrowing can increase the operating frequency, but traditional backend optimization tools struggle to manage hold time requirements. The Mix & Latch technique achieves higher frequencies and often lower area than commercial state-of-the-art retiming by exploiting four types of synchronous sequential gates, namely positive and negative edge-triggered FFs and positive and negative transparent latches, all driven by a single clock tree. In this paper, we first significantly accelerate the convergence of the Mix & Latch flow with respect to past work by using a post-synthesis timing analysis, which eliminates the first placement and routing run needed for post-layout timing analysis. Then, we add tolerance margins to the timing model, which reduces its pessimism and improves both convergence speed and maximum frequency. Finally, we reduce the complexity of the problem by applying the methodology only to the sequential elements that belong to critical paths. We demonstrate the effectiveness of Mix & Latch on a RISC-V processor core from the PULP platform in a 28 nm FD-SOI CMOS technology. The results are compared with both the original Mix & Latch flow and retiming performed with a state-of-the-art tool, showing a frequency improvement of 25 % over the original flow and 7.5 % over the retiming flow. Compared to the retiming flow, we achieve comparable or lower power and area, while preserving the original registers and allowing logic equivalence checking.