Long Short-Term Memory (LSTM) networks can learn complex relationships between variables across past and current timesteps in time-series data and exploit those relationships for specific forecasting tasks. Like other deep neural networks, LSTMs are built from stacked layers of neural units; the more layers a network has, the deeper it is. Training an LSTM involves two propagation phases, a forward pass of activations and a backward pass of gradients (backpropagation through time). However, the vanishing gradient problem commonly arises when developing LSTM networks. This paper proposes two LSTM models, built with sequence-to-sequence and sequence-to-vector framings, and investigates possible empirical solutions to the vanishing gradient problem, specifically in the context of predicting multivariate and univariate power demand, through a comparative evaluation on electricity demand data.
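To make the distinction between the two framings concrete, the following is a minimal Keras sketch, not the paper's actual models: the input window length (24 timesteps), feature count, and layer sizes are illustrative assumptions. A sequence-to-vector model compresses the whole input window into one forecast, while a sequence-to-sequence model emits a forecast at every timestep.

```python
# Minimal sketch of the two LSTM framings (illustrative, not the paper's models).
# Assumed shapes: a window of 24 timesteps with 1 feature (univariate demand).
import tensorflow as tf
from tensorflow.keras import layers, models

TIMESTEPS, FEATURES = 24, 1  # hypothetical input window

# Sequence-to-vector: the LSTM returns only its final hidden state,
# which a Dense layer maps to a single next-step demand forecast.
seq2vec = models.Sequential([
    layers.Input(shape=(TIMESTEPS, FEATURES)),
    layers.LSTM(32),      # return_sequences=False by default
    layers.Dense(1),      # one scalar forecast
])

# Sequence-to-sequence: the LSTM returns a hidden state at every timestep,
# and a TimeDistributed Dense produces one forecast per input step.
seq2seq = models.Sequential([
    layers.Input(shape=(TIMESTEPS, FEATURES)),
    layers.LSTM(32, return_sequences=True),
    layers.TimeDistributed(layers.Dense(1)),  # one forecast per timestep
])

seq2vec.compile(optimizer="adam", loss="mse")
seq2seq.compile(optimizer="adam", loss="mse")
```

For a multivariate variant, FEATURES would simply be raised to the number of input series; the choice between the two framings then determines whether the model predicts one aggregate value or a full output sequence.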