Short-term load forecasting is a critical task in the smart grid: it can be used to optimize power deployment and reduce power losses. Recurrent neural networks (RNNs) are the most popular deep learning models for short-term load forecasting. However, despite achieving better forecasting accuracy than traditional models, the performance of existing RNN-based load forecasting approaches is still unsatisfactory for practical use. In this work, we therefore propose an input attention mechanism (IAM) and a hidden connection mechanism (HCM) to substantially enhance the accuracy and efficiency of RNN-based load forecasting models. Specifically, IAM assigns importance weights to the input layers, which yields better performance in both efficiency and accuracy than traditional attention mechanisms. To further improve efficiency, HCM applies residual connections to accelerate model convergence. We apply both IAM and HCM to four state-of-the-art RNN implementations and conduct extensive experimental studies on two public datasets. Experimental results show that RNNs equipped with IAM and HCM outperform the state-of-the-art baselines in both accuracy and efficiency. Ablation studies show that both IAM and HCM are essential to achieving this superior performance.
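To make the two mechanisms concrete, the following is a minimal, hypothetical sketch (not the paper's actual implementation): an RNN cell whose input is reweighted by softmax attention scores before the recurrent update (illustrating the idea behind IAM), and whose new hidden state receives a residual skip connection from the previous hidden state (illustrating the idea behind HCM). All class and weight names here are assumptions introduced for illustration.

```python
import math
import random

random.seed(0)  # reproducible toy weights


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


class AttnResidualRNNCell:
    """Toy RNN cell sketching input attention + a residual hidden connection."""

    def __init__(self, n_in, n_hid):
        rand = lambda r, c: [[random.uniform(-0.1, 0.1) for _ in range(c)]
                             for _ in range(r)]
        self.n_in = n_in
        self.W_attn = rand(n_in, n_in)  # scores each input feature (IAM-style)
        self.W_x = rand(n_hid, n_in)    # input-to-hidden weights
        self.W_h = rand(n_hid, n_hid)   # hidden-to-hidden weights

    def step(self, x, h):
        # IAM-style step: softmax importance weights over the input features,
        # applied multiplicatively before the recurrent update.
        scores = [sum(w * xi for w, xi in zip(row, x)) for row in self.W_attn]
        alpha = softmax(scores)
        x_w = [a * xi * self.n_in for a, xi in zip(alpha, x)]  # rescaled input

        # Plain RNN update on the attended input.
        pre = [sum(wx * xi for wx, xi in zip(rx, x_w)) +
               sum(wh * hi for wh, hi in zip(rh, h))
               for rx, rh in zip(self.W_x, self.W_h)]
        h_new = [math.tanh(p) for p in pre]

        # HCM-style step: residual (skip) connection from the previous
        # hidden state, which tends to ease gradient flow and speed training.
        return [hn + hp for hn, hp in zip(h_new, h)]


# Usage on a toy load series (3 features per timestep, hidden size 4).
cell = AttnResidualRNNCell(n_in=3, n_hid=4)
h = [0.0] * 4
for x in [[0.5, 0.1, 0.3], [0.6, 0.2, 0.1]]:
    h = cell.step(x, h)
```

In a real forecasting model these updates would be applied inside a trained LSTM or GRU; the sketch only shows where the attention weighting and the residual connection enter the recurrence.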