The increasing demand for higher data rates motivates the exploration of advanced techniques for future wireless networks. Massive multiple-input multiple-output (mMIMO) is envisioned as a key technique to meet this demand. However, the large number of antennas in mMIMO systems, combined with a short coherence time, makes the downlink channel estimation (DCE) overhead potentially overwhelming. As such, the number of training sequences (TSs) needs to be significantly reduced. Reducing the number of TSs, however, significantly degrades the mean-squared error (MSE) of the channel estimate, and to date it is not clear to what extent this reduction affects the achievable sum rate. Therefore, this paper develops a low-complexity and tractable TS solution for DCE and establishes an analytical framework for the optimum TS. Furthermore, the tradeoff between the achievable sum rate maximization criterion and the MSE minimization criterion is investigated. This investigation is essential to characterize the optimum TS length and the actual performance of mMIMO systems when the channel exhibits a limited coherence time. For this purpose, the statistical structure of mMIMO channels is exploited. In addition, this paper uses a random matrix theory (RMT) method to characterize the downlink achievable sum rate and the MSE in closed form. This paper shows that maximizing the downlink sum rate is a more important criterion than minimizing the MSE of the SINR alone, which is the criterion typically considered in conventional MIMO systems and/or in time division duplex (TDD) mMIMO systems. The results demonstrate that a feasible downlink achievable sum rate can be achieved in a frequency division duplex (FDD) mMIMO system.
This finding is necessary to extend the benefit of mMIMO systems to high-frequency bands such as millimeter-wave (mmWave) and terahertz (THz) communications.

INDEX TERMS Massive MIMO transmission, downlink channel estimation, achievable sum rate maximization, frequency division duplex operation mode, second-order channel statistics, random matrix theory, mean square error minimization.