As part of a new MPI Remote Memory Access (RMA) implementation over remote direct memory access (RDMA), we propose a message scheduling scheme that interleaves inter-node and intra-node data transfers so as to minimize the overall latency of the RMA epoch. The scheme overlaps inter-node communication with intra-node communication, and we show that further latency mitigation is possible when computation length and communication payload sizes are too unbalanced to allow a satisfactory level of communication/computation overlap. Test results show that the proposed scheduling scheme compares favorably with the approaches used in MVAPICH and Open MPI.