Secure multi-party computation (SMPC) allows mutually distrusting parties to jointly evaluate a function without revealing their private inputs. This technique lets organizations collaborate toward a common goal without disclosing confidential or protected data. Despite its suitability for privacy-preserving computation, SMPC suffers from network-bound performance limitations. Specifically, the parties execute the protocol in rounds: in each round, a party performs a local computation and then shares its output with the other parties. This network exchange creates a bottleneck, because each party must wait for the data to propagate before resuming execution. To reduce SMPC execution time, we propose a pipelining-like approach that overlaps each round's computation and communication by dividing the data into parts and reordering the execution. Targeting deep learning applications, we propose strategies for matrix multiplication, a core component of such applications. Our results on a distributed cloud deployment show a significant reduction in SMPC execution time.
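To make the pipelining idea concrete, the following is a minimal Python sketch of overlapping computation and communication in a chunked matrix multiplication. It is an illustration under stated assumptions, not the implementation evaluated here: the names `matmul`, `pipelined_matmul`, `chunk_rows`, and `send` are hypothetical, the `send` callback stands in for a real network transfer, and the arithmetic runs on plain values rather than secret shares.

```python
import queue
import threading


def matmul(rows, b):
    """Plain nested-list product of rows (m x k) and b (k x n)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in rows]


def pipelined_matmul(a, b, chunk_rows, send):
    """Compute a @ b in row chunks, sending each finished chunk in the
    background while the next chunk is being computed.

    `send` is a hypothetical stand-in for the network call to the other parties.
    """
    to_send = queue.Queue()

    def sender():
        # Drain the queue in FIFO order until the sentinel arrives.
        while True:
            block = to_send.get()
            if block is None:
                return
            send(block)

    worker = threading.Thread(target=sender)
    worker.start()

    result = []
    for start in range(0, len(a), chunk_rows):
        block = matmul(a[start:start + chunk_rows], b)  # local computation
        to_send.put(block)  # hand off to the sender thread without waiting
        result.extend(block)

    to_send.put(None)  # sentinel: no more chunks
    worker.join()      # wait for the last transfer to finish
    return result
```

Without chunking, the whole product would have to be computed before any data could leave the party. With chunking, the transfer of one block can proceed while the next block is computed, so the achievable gain depends on how closely per-chunk compute time matches per-chunk transfer time.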