A tri-layer bus system-on-chip (SoC) and a butterfly-path accelerator are used to enhance the system-level performance of a sequential minimal optimisation (SMO) learning core. The tri-layer bus architecture provides an adequate data transfer rate, and the butterfly-path accelerator uses symmetrical access to resolve bottlenecks during the extraction of linear prediction cepstral coefficients (LPCCs). This novel design increases speed and flexibility without substantially increasing area. For chip implementation, the SoC is synthesised, placed and routed using the TSMC 90 nm technology library. The die size is 2.09 mm × 2.09 mm, and the power consumption is 8.9 mW. Simulation results show that the proposed architecture is 2.4 times faster than the non-butterfly-path design. In addition, clock down-sampling and voltage scaling reduce the power consumed by the proposed chip by a factor of 8.5. The experimental results confirm the improvements in speed and power consumption provided by the proposed architecture and methods.
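
For readers unfamiliar with the workload, the following is a minimal software sketch of linear prediction cepstral coefficient extraction using the textbook route: autocorrelation, Levinson-Durbin recursion, then the standard LPC-to-cepstrum recursion. It is an illustration under stated assumptions, not the chip's datapath. The function names are hypothetical, the all-pole model is assumed to be 1/(1 - Σ a_k z^-k), and neither the butterfly-path accelerator nor the tri-layer bus is modelled.

```python
from typing import List, Tuple


def autocorrelation(frame: List[float], order: int) -> List[float]:
    """Return autocorrelation lags r[0..order] of one frame of samples."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(order + 1)
    ]


def levinson_durbin(r: List[float], order: int) -> Tuple[List[float], float]:
    """Solve for predictor coefficients a_1..a_p from lags r[0..p].

    Convention (an assumption of this sketch): x[n] is predicted as
    sum_k a_k * x[n-k]. Returns (coefficients, final prediction error).
    """
    a = [0.0] * (order + 1)  # a[0] is unused; a[k] holds a_k
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (e.g. all zeros); stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for stage i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lpc_to_cepstrum(a: List[float], n_ceps: int) -> List[float]:
    """Convert predictor coefficients (a[0] is a_1) to cepstra c_1..c_n."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused here (gain term omitted)
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        # Only terms with 1 <= m - k <= p contribute to the sum.
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]


def lpcc(frame: List[float], order: int, n_ceps: int) -> List[float]:
    """Full pipeline for one frame: autocorrelation, LPC, cepstrum."""
    r = autocorrelation(frame, order)
    a, _ = levinson_durbin(r, order)
    return lpc_to_cepstrum(a, n_ceps)
```

As a sanity check on the recursion, a first-order model with a_1 = 0.5 should give cepstra c_m = 0.5^m / m, which is what `lpc_to_cepstrum([0.5], 3)` produces up to rounding.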